METHOD FOR ISOLATION OF BIOSYNTHESIS GENES
FOR BIOACTIVE MOLECULES 

DESCRIPTION 

BACKGROUND OF THE INVENTION 
This application relates to a method for the isolation of biosynthesis genes for 
antibiotics and other bioactive molecules from complex natural sources such as humus, soil 
and lichens. 

Antibiotics play an important role in man's efforts to combat disease and other
economically detrimental effects of microorganisms. Traditionally, antibiotics have been
identified by screening microorganisms, especially those found naturally in soil, for their
ability to produce an antimicrobial substance. In some cases, the gene or genes responsible
for antibiotic synthesis have then been identified and cloned into producer organisms which
produce the antibiotic in an unregulated manner for commercial applications. However, it
has been estimated that less than 1% of the microorganisms present in soil are culturable.
Torsvik et al., Appl. Environ. Microbiol. 56: 782-787 (1990). Thus, much of the genetic
diversity potentially available in soil microorganisms is inaccessible through traditional
techniques.

As pathogenic microorganisms become increasingly resistant to known
antibiotics, it would be highly desirable to be able to access the reservoir of genetic
diversity found in soil, and to facilitate the exploration of new species of antibiotics which
may be made by the vast numbers of unculturable organisms found there. It would further be
desirable to have access to novel biosynthetic enzymes and the genes encoding such enzymes,
which could be used in recombinant organisms for antibiotic production or for in vitro
enzymatic synthesis of desirable compounds. Thus, it is an object of the present invention to
provide a method and compositions for isolating DNA and DNA fragments encoding
enzymes relevant to the production of pharmaceutically active molecules such as antibiotic
biosynthesis enzymes.



SUMMARY OF THE INVENTION
We have now identified degenerate primers which hybridize with various
classes of antibiotic biosynthesis genes, and have used such primers to amplify fragments of
DNA from soil and lichen extracts. Cloning and sequencing of the amplified products
showed that these products included a variety of novel and previously uncharacterized
antibiotic biosynthesis gene sequences, the products of which have the potential to be active
as antibiotics, immunosuppressors, antitumor agents, etc. Thus, antibiotic biosynthesis genes
can be recovered from soil by a method in accordance with the present invention comprising
the steps of:

(a) combining a soil-derived sample with a pair of amplification primers
under conditions suitable for polymerase chain reaction amplification, wherein the primer set
is a degenerate primer set selected to hybridize with conserved regions of known antibiotic
biosynthetic pathway genes, for example Type I and Type II polyketide synthase genes,
isopenicillin N synthase genes, and peptide synthetase genes;

(b) cycling the combined sample through a plurality of amplification
cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the present invention, antibiotic biosynthesis genes can be
recovered from soil and lichens by a method comprising the steps of:

(a) combining a humic or lichen-derived sample with a pair of
amplification primers under conditions suitable for polymerase chain reaction amplification,
wherein the primer set is a degenerate primer set selected to hybridize with conserved regions
of an antibiotic biosynthesis gene;

(b) cycling the combined sample through a plurality of amplification
cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

As used in the specification and claims of this application, the term "humic or
lichen-derived sample" encompasses any sample containing the DNA found in lichens or in
samples of humic materials including soil, mud, peat moss, marine sediments, and effluvia
from hot springs and thermal vents in accessible form for amplification, substantially without
alteration of the natural ratios of such DNA in the sample. One exemplary form of a humic
sample is a sample obtained by performing direct lysis as described by Barns et al., Proc.
Nat'l Acad. Sci. USA 91: 1609-1613 (1994) on a soil sample and then purifying the total DNA
extract by column chromatography. Related extraction methods can be applied to the
isolation of community DNA from other environmental sources. See Trevors et al., eds.,
Nucleic Acids in the Environment, Springer Lab Manual (1995). Lichen-derived samples
may be prepared from foliose lichens by the method of fungal DNA extraction described by
Miao et al., Mol. Gen. Genet. 226: 214-223 (1991). Specific non-limiting procedures for
isolation of DNA from humic and lichen samples are set forth in the examples herein.

The humic or lichen-derived sample is combined with at least one, and
optionally with several, pairs of amplification primers under conditions suitable for
polymerase chain reaction amplification. Polymerase chain reaction (PCR) amplification is a
well known process. The basic procedure, which is described in US Patent Nos. 4,683,202
and 4,683,195, which are incorporated herein by reference, makes use of two amplification
primers each of which hybridizes to a different one of the two strands of a DNA duplex.
Multiple cycles of primer extension using a polymerase enzyme and denaturation are used to
produce additional copies of the DNA in the region between the two primers. In the present
invention, PCR amplification can be performed using any suitable polymerase enzyme,
including Taq polymerase and Thermo Sequenase™.

The amplification primers employed in the method of the invention are degenerate
primer sets selected to hybridize with conserved regions of known antibiotic biosynthetic
genes, for example Type I and Type II polyketide synthase genes, isopenicillin N synthase
genes, and peptide synthetase genes. Each degenerate primer set of the invention includes
multiple primer species which hybridize with one DNA strand, and multiple primer species
which hybridize with the other DNA strand. All of the primer species within a degenerate
primer set which bind to the first strand are the same length, and hybridize with the same
target region of the DNA. These primers all have very similar sequences, but have a few
bases different in each species to account for the observed variations in the target region. For
this reason, they are called degenerate primers.
Similarly, all of the primers within a degenerate primer set which bind to the second strand
are the same length, hybridize with the same target region of the DNA, and have very similar
sequences with a few bases different in each species to account for the observed variations in
the target region.
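
The relationship between a degenerate primer written with parenthesized alternatives and the individual primer species it represents can be illustrated with a short Python sketch. The function name expand_degenerate is hypothetical and forms no part of the method itself; the input is the sequence of SEQ ID No. 4 as transcribed in this specification, and inosine (I), where present, would simply be carried through as a literal character.

```python
import itertools
import re


def expand_degenerate(primer):
    """Return every primer species encoded by alternatives such as (C/G)."""
    # Each token is either a parenthesized set of alternatives or a single base.
    tokens = re.findall(r"\(([^)]+)\)|([A-Za-z])", primer)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


# SEQ ID No. 4 as transcribed in this specification (three two-fold positions).
seq4 = "GTT(C/G)AC(C/G)GCGTAGAACCA(C/G)GCGAA"
species = expand_degenerate(seq4)

print(len(species), "primer species, lengths:", {len(s) for s in species})
```

Every species produced in this way has the same length and targets the same region, differing only at the degenerate positions.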

The degenerate primer sets of the invention are selected to hybridize to highly
conserved regions of known antibiotic biosynthesis genes in such a way that they flank a
region of several hundred (e.g. 300) or more base pairs such that amplification leads to the
selective reproduction of DNA spanning a substantial portion of the antibiotic biosynthesis
gene. Selection of primer sets can be made based upon published sequences for classes of
antibiotic biosynthesis genes.

For example, for amplification of Type I polyketide synthase genes, we have
designed primers based upon the conserved sequences of six beta-ketoacyl carrier protein
synthase domains of the erythromycin gene cluster. Donadio et al., Science 252: 675-679
(1991); Donadio and Staver, Gene 126: 147-151 (1993). These primers have the sequences

5'-GC(C/G) (A/G)T(G/C) GAG CCG GAG CG CGC-3' [SEQ ID No. 1]

and

5'-GAT (C/G)(G/A)C GTC CGC (G/A)TT (C/G)GT (C/G)CC-3' [SEQ ID No. 2].

The expected size of the PCR product is 1.2 kilobase pairs. Other degenerate primer sets for
Type I and Type II polyketide synthase genes could be determined from sequence
information available in Hutchinson and Fujii, Ann. Rev. Microbiol. 49: 201-238 (1995).

Type II polyketide synthase gene clusters are characterized by the presence of
chain length factor genes which are arranged at the 3'-end of the ketosynthase genes. Primers
were designed based on one conserved region near the 3'-end of the ketosynthase gene and
one at the middle portion of the chain length factor gene. The sequences of one suitable set
of amplification primers are:

5'-CT(C/G)AC(G/C)(G/T)(C/G)GG(C/G)CGIAC(C/G)GC(C/G)AC(C/G)CG-3' [SEQ ID No. 3]

and

5'-GTT(C/G)AC(C/G)GCGTAGAACCA(C/G)GCGAA-3' [SEQ ID No. 4].

The expected size of the PCR product is 0.5 kilobase pairs. An alternative set of
degenerate primers has the sequences

5'-TTCGG(G/G)GGITTGCAG(T/A)(G/G)IGC(C/G)ATG [SEQ ID No. 5]

and

5'-TC(C/G)A(G/T)(C/G)AG(C/G)GC(C/G)AI(C/G)GA(C/G)TCGTAICC [SEQ ID No. 6].

These primers were designed based upon consensus sequences for the regions flanking the
KSβ (chain length factor) genes. The consensus sequences are available from Hutchinson and
Fujii, supra.

Primers were designed for beta-lactam biosynthetic genes on the basis of the
conserved sequences of a number of isopenicillin N synthase genes as described in
Aharonowitz et al., Ann. Rev. Microbiol. 46: 461-495 (1992). These primers have the
sequences

5'-GG(C/G/T) TC(C/G) GG(C/G) TT(C/T) TTC TAG GC-3' [SEQ ID No. 7]

and

5'-CCT (C/G)GG TCT GG(A/T) A(C/G)A G(C/G)A CG-3' [SEQ ID No. 8].

The expected size of the PCR product is 570 base pairs. Other degenerate primer sets could
be determined from sequence information available in Jensen and Demain, "Beta-Lactams" in
Genetics and Biochemistry of Antibiotic Production (L.C. Vining and C. Stuttard, eds.), pp.
239-268, Butterworth-Heinemann, Newton, MA (1995).

For isolation of peptide synthetase genes, primers based on two of the
conserved core sequences within the functional domains of peptide synthetase genes as
described by Turgay and Marahiel, Peptide Res. 7: 238-241 (1994) were utilized. These
primers had the sequences

5'-ATCTACAC(G/C)TC(G/C)GGCAC(G/C)AC(G/C)GGCAAGCC(G/C)AAGGG-3' [SEQ ID No. 9]

and

5'-A(A/T)IGAG(T/G)(C/G)ICCICC(G/C)(A/G)(A/G)(G/C)I(A/C)GAAGAA-3' [SEQ ID No. 10].

The expected size of the PCR product is 1.2 kilobase pairs.

PCR amplification can also be used for isolating lichen-derived antibiotic
biosynthesis genes and gene fragments. For isolation of Type I polyketide synthase genes
from lichens, the primer set used was previously described by Keller et al. in Molec. Appl. to
Food Safety Involving Toxic Microorganisms, J.L. Richard, ed., pp. 263-277 (1995), and had
the following sequences:

5'-MGIGARGCIYTIGCIATGGAYCCICARCARMG [SEQ ID No. 11]

and

5'-GGRTCNCCIARYTGIGTICCIGTICCRTGIGC [SEQ ID No. 12].

The expected size of the PCR product is approximately 0.7 to 0.9 kilobases. Actual products
evaluated ranged in size from 637 to 809 nucleotides (not including the 61 nt due to the
primers).
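
Unlike the primers listed earlier, the lichen primers are written with single-letter ambiguity codes. As a hedged illustration only, the following Python sketch counts how many distinct primer species each such sequence denotes. The function name degeneracy is hypothetical; the sketch assumes the standard IUPAC meanings M = A/C, R = A/G, Y = C/T and N = any base, assumes that I denotes inosine and contributes no degeneracy, and assumes that the letter shown after "CC" in SEQ ID No. 11 is an inosine.

```python
# IUPAC ambiguity codes appearing in SEQ ID Nos. 11 and 12.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "Y": "CT", "N": "ACGT",
    "I": "I",  # assumption: inosine is a single universal base, not a mixture
}


def degeneracy(primer):
    """Number of distinct primer species represented by an IUPAC-coded primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


seq11 = "MGIGARGCIYTIGCIATGGAYCCICARCARMG"
seq12 = "GGRTCNCCIARYTGIGTICCIGTICCRTGIGC"

print(degeneracy(seq11), degeneracy(seq12))
```

The use of inosine at highly variable positions keeps the number of species in each set far lower than full four-fold degeneracy at those positions would require.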

Once the primers and the sample are cycled through sufficient thermal cycles
to selectively amplify antibiotic biosynthetic DNA in the sample (generally around 25 cycles
or more), the amplified DNA is isolated from the amplification mixture. Isolation can be
accomplished in a variety of ways. For example, the PCR products can be isolated by
electrophoresis on an agarose or polyacrylamide gel, visualized with a stain such as ethidium
bromide and then excised from the gel for cloning. Primers modified with an affinity binding
moiety such as biotin may also be used during the amplification step, in which case the
affinity binding moiety can be used to facilitate the recovery. Thus, in the case of
biotinylated primers, the amplified DNA can be recovered from the amplification mixture by
coupling the biotin to a streptavidin-coated solid support, for example Dynal
streptavidin-coated magnetic beads.

It will be appreciated that the DNA obtained as a result of this isolation will
not generally be of a single type because of the degeneracy of the primers and the complexity
of the initial sample. Thus, although these steps are sufficient to recover antibiotic
biosynthesis genes from soil or lichen, it is preferable to further separate and characterize the
individual species of amplified DNA.

This further separation and characterization can be accomplished by inserting
the amplified DNA into an expression vector and cloning in a suitable host. The specific
combination of vectors and hosts will be understood by persons skilled in the art, although
bacterial expression vectors and bacterial hosts are generally preferred. Individual clones
are then picked and the sequence of the cloned plasmid determined. While random selection
has been employed successfully, selection of antibiotic biosynthesis gene-containing clones
can be facilitated by screening using hybridization with DNA probes based on conserved
sequences or by overlay of bacterial clones with an antibiotic-sensitive test strain.

Once the sequence of the cloned DNA is determined, it can be screened
against existing libraries of nucleotide and protein sequences for confirmation as an antibiotic
biosynthetic gene or gene fragment. Amplified DNA so identified can be used in several
ways. First, the amplified DNA, or distinctive portions thereof, can be used as probes to
screen libraries constructed from humic-derived or lichen DNA to facilitate the identification
and isolation of full length antibiotic biosynthetic genes. Once isolated, these genes can be
expressed in readily cultivated surrogate hosts, such as a Streptomyces species for
soil-derived genes or an Aspergillus species for lichen-derived genes. General procedures for
such expression are known in the art, for example from Fujii et al., Molec. Gen. Genet. 253:
1-10 (1996) and Bedford et al., J. Bacteriol. 177: 4544-4548 (1995), which are incorporated
herein by reference.
Second, amplified DNA which is different from previously known DNA can be used to
generate hybrid antibiotic biosynthesis genes using the procedures described by McDaniel et
al., Nature 375: 549-554 (1995); Stachelhaus et al., Science 269: 69-72 (1995); and
Stachelhaus et al., Biochem. Pharmacol. 52: 177-186 (1996). In these procedures, the novel
DNA sequences isolated using the method of the invention are spliced into a known antibiotic
gene to provide an expressible sequence encoding a complete gene product.

Using the method of the invention, a number of unique nucleotide sequences
have been identified and characterized. The sequences and the biosynthetic
polypeptides/proteins which they encode, given by Sequence ID Nos. 13 to 80, are a
further aspect of the present invention.

EXAMPLE 1 

Total DNA was extracted from soil samples by a direct lysis procedure as
described by Barns et al. (1994). The high molecular weight DNA (>20 kb) in the extract
was separated on a Sephadex G200 column (Pharmacia, Uppsala, Sweden) as described by
Tsai and Olson, Appl. Environ. Microbiol. 58: 2292-2295 (1992).

The DNA extract (10-50 ng template DNA) was added to an amplification
mixture (total volume 100 µl) containing 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 2 mM
MgCl2, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each Type I polyketide
primer (Seq. ID Nos. 1 and 2) and 5.0 units of Taq polymerase (BRL Life Technologies,
Gaithersburg, MD). The mixture was then thermally cycled for 30 cycles in an MJ Research
PTC-100 thermocycler using the following program:

denaturation 93 °C 60 seconds
annealing 60 °C 30 seconds
extension 72 °C 90 seconds
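
For planning purposes, the cycling program above can be written down as data and the total programmed hold time computed. The following Python sketch is illustrative only; the names Step and total_hold_seconds are hypothetical, and the calculation ignores the ramp times between temperatures, which depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Program of Example 1: three steps per cycle, 30 cycles.
EXAMPLE_1_PROGRAM = (
    Step("denaturation", 93, 60),
    Step("annealing", 60, 30),
    Step("extension", 72, 90),
)
CYCLES = 30


def total_hold_seconds(program, cycles):
    """Sum of programmed hold times; temperature ramping is not included."""
    return cycles * sum(step.seconds for step in program)


hold = total_hold_seconds(EXAMPLE_1_PROGRAM, CYCLES)
print(f"{hold} s of programmed holds ({hold / 60:.0f} min)")
```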

The PCR products were then electrophoresed in 1% agarose gels and stained
with ethidium bromide to visualize the DNA bands. Bands containing PCR product of the
expected size were excised from the gel and purified using a Qiaex Gel Extraction kit (Qiagen
GmbH). The purified DNA was ligated to pCRII (Invitrogen) to generate a clone library
using E. coli INVαF' competent cells. Eighteen clones were chosen at random from the library
and sequenced using a Taq Dye Terminator Cycle Sequencing Kit and an Applied Biosystems
DNA sequencer model 373. The sequencing primers used included the universal M13 (-20)
forward primer, the M13 reverse primer and primers designed from the sequence data
obtained. DNA sequences were translated into partial amino acid sequences using a software
package from Geneworks (Intelligenetics, Inc.) with further manual adjustments and sent to
the NCBI database by e-mail at blast@ncbi.nlm.nih.gov for comparison against protein
databases. Altschul et al., "Basic Local Alignment Search Tool", J. Mol. Biol. 215: 403-410
(1990).
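
The translation step can be illustrated with a minimal Python sketch. This is not the Geneworks package used in this example; it is a hypothetical stand-in that assumes the standard genetic code and translates a single forward reading frame, marking stop codons with "*" and unrecognized codons with "X".

```python
import itertools

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}


def translate(dna, frame=0):
    """Translate one forward reading frame (0, 1 or 2) of a DNA sequence."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


print(translate("ATGGCCTAA"))
```

Translating all three forward frames, and the three frames of the reverse complement, gives the six conceptual translations that are ordinarily compared against a protein database.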

BLAST analysis of the 18 clones pointed to 12 unique sequences that were not
identical to each other or to published sequences. Seq. ID No. 13 shows the complete DNA
sequence of a representative unique clone (Clone ksfs). Seq. ID No. 14 shows the translated
amino acid sequence of this clone. The greatest homology as determined by a BLAST analysis
is indicated to be Type I polyketide synthases. Similar results were obtained on the BLAST
search of the other 11 unique clones based upon partial sequences which were determined.

EXAMPLE 2 

The experiment of Example 1 was repeated using isopenicillin N synthase
gene primers (Seq. ID Nos. 7 and 8). The thermal cycling program was changed to include
60 second extension periods at 72 °C, but otherwise the experimental conditions were the same.
Twelve clones were picked at random and yielded one unique sequence that was not identical
to published sequences. The complete sequence of this clone (Clone ipnsfs) is shown in Seq.
ID No. 15 and the translated amino acid sequence in Seq. ID No. 16. The BLAST search
indicated greatest homology for this sequence with isopenicillin N synthases.

EXAMPLE 3

The experiment of Example 1 was repeated using peptide synthetase primers
(Seq. ID Nos. 9 and 10). The amplification mixture was changed to a 50 µl volume containing
10 to 50 ng of template DNA, 20 mM (NH4)2SO4, 74 mM Tris-HCl (pH 8.8), 1.5 mM MgCl2,
0.01% Tween 20, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each primer,
0.25% skim milk and 0.4 units of Ultra Therm DNA Polymerase (Bio/Can Scientific,
Mississauga, Ontario). The mixture was thermocycled for 30 cycles using the following
program:

denaturation 95 °C 60 seconds
annealing 52 °C 60 seconds
extension 72 °C 120 seconds

Thirty clones containing a 1.2 kb insert have been partially sequenced. The
BLAST analysis of the 30 clones pointed to 28 unique sequences that were not identical to
each other or to published sequences. Varying degrees of homology to known peptide
synthase genes were seen. Seq. ID No. 17 shows the complete DNA sequence of a
representative clone (ps32). Seq. ID No. 18 shows the translated amino acid sequence of this
clone. Based on a BLAST search of these sequences, the greatest homology is to a peptide
synthase gene such as the pristinamycin synthase gene from Streptomyces pristinaespiralis
and Bacillus sp. peptide synthetase genes such as gramicidin S synthase and surfactin
synthetase. Stachelhaus and Marahiel, FEMS Micro. Letters 125: 3-14 (1995); Turgay et al.,
Mol. Micro. 6: 529-546 (1992).

Sequence ID Nos. 81 to 94 show an additional 7 unique sequences (nucleic
acid and translated amino acid sequences) of 1.2 kb PCR products amplified from soil DNA
samples using these primers. These sequences have been named ps 2, ps 3, ps 7, ps 10, ps 24,
ps 25 and ps 30. The sequences are unique in that they are all different from each other and
from ps 32, and while they show greatest homology to peptide synthetase sequences in the
databases searched by BLAST analysis, they do not match any known sequence. Within each, the
conserved motifs (TGD, KIRGXRIEL, NGK) common to peptide synthetase domains as
described by Turgay and Marahiel (1994) can be identified. Descriptive information of the
clones follows:

Clone ps 2, 1204 bp, with conserved motifs SGD, KIRGFRIEL, NGK, 67% G + C
Clone ps 3, 1178 bp, with conserved motifs TGD, KIRGSRIEL, NGK, 59% G + C
Clone ps 7, 1222 bp, with conserved motifs TGD, KIRGYRIEL, NGK, 55.5% G + C
Clone ps 10, 1171 bp, with conserved motifs TGD, KIRGHRIEL, NLK, 63% G + C
Clone ps 24, 1190 bp, with conserved motifs TGD, KIRGHRIAM, NQK, 56% G + C
Clone ps 25, 1178 bp, with conserved motifs TGD, KLRGYRIEL, NDK, 68% G + C
Clone ps 30, 1200 bp, with conserved motifs TGD, KVRGFRIEP, NGK, 64.5% G + C
Clone ps 32, 1172 bp, with conserved motifs TGD, KIRGFRIEL, SGK, 67% G + C
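
The extent to which each clone's central motif departs from the KIRGXRIEL consensus can be tabulated with a short Python sketch. The motif strings are taken from the list above; the function name mismatches is hypothetical, and the X in the consensus is assumed to accept any residue.

```python
CONSENSUS = "KIRGXRIEL"  # X marks the variable position

# Central conserved motif reported for each clone in the list above.
CLONE_MOTIFS = {
    "ps 2": "KIRGFRIEL", "ps 3": "KIRGSRIEL", "ps 7": "KIRGYRIEL",
    "ps 10": "KIRGHRIEL", "ps 24": "KIRGHRIAM", "ps 25": "KLRGYRIEL",
    "ps 30": "KVRGFRIEP", "ps 32": "KIRGFRIEL",
}


def mismatches(motif, consensus=CONSENSUS):
    """Count positions that differ from the consensus, ignoring X positions."""
    return sum(1 for m, c in zip(motif, consensus) if c != "X" and m != c)


deviations = {clone: mismatches(motif) for clone, motif in CLONE_MOTIFS.items()}
for clone, count in deviations.items():
    print(f"{clone}: {count} mismatch(es)")
```

On the motifs as listed, five of the eight clones match the consensus exactly and the remaining three (ps 24, ps 25 and ps 30) differ at one or two positions.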

EXAMPLE 4 

The experiment of Example 1 was repeated using the Type II polyketide
synthase primers given by Seq. ID Nos. 3 and 4. PCR amplification was carried out in a
total volume of 50 µl containing 50 ng of soil DNA, 20 mM Tris-HCl (pH 8.4), 50 mM KCl,
2 mM MgCl2, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each primer and 5.0
units of Taq polymerase (BRL Life Technologies, Gaithersburg, MD). The thermal cycling
conditions included denaturation at 94 °C for 60 seconds, annealing at 58 °C for 30 seconds
and extension at 72 °C for seconds, repeated for a total of 30 cycles.

PCR amplification yielded products of the expected size of 0.5 kilobase pairs.
Sequencing of 18 randomly selected clones revealed the presence of 5 unique sequences that
were not identical to each other or to published sequences. Seq. ID No. 19 shows the
complete DNA sequence of a representative clone (clone clf). The translated amino acid
sequence of this clone is shown in Seq. ID No. 20. In a BLAST search of this DNA
sequence against the protein database, the greatest homology is indicated to chain length
factor genes of the Type II polyketide synthases.

EXAMPLE 5

The experiment of Example 1 was repeated using the Type I polyketide
synthase primers designed for fungal sequences (Seq. ID Nos. 11 and 12). PCR
amplifications were carried out with lichen DNA samples from a variety of lichen species
representing 11 genera prepared as described in Miao et al. (1991), supra.

PCR amplifications were carried out in a total volume of 50 µl containing
approximately 10 ng of lichen DNA and 1 unit of Taq polymerase in a reaction as per
Example 4. The cycling protocol was 30 cycles of denaturation at 95 °C for 60 seconds,
annealing at 57 °C for 2 minutes and extension at 72 °C for 2 minutes.

Forty-seven clones with inserts of the expected size have been partially
sequenced. The sequences all show homology to Type I fungal polyketide synthase genes but
are all distinct from each other and from known sequences. Seq. ID No. 21 shows the
complete DNA sequence of a 637 base pair product amplified from DNA extracted from the
lichen Xanthoparmelia cumberlandia (clone Xa.cum.6A). The translated amino acid
sequence is shown in Seq. ID No. 22. The greatest homology as determined by BLAST analysis
is indicated to fungal Type I polyketide synthase genes. Sequence ID Nos. 29 and 30 show the
DNA sequence and conceptual amino acid sequence, respectively, for a further clone
Xa.cum.6H isolated in this experiment. Sequences of DNA and the corresponding amino
acid sequences were also obtained for seven other lichen samples: Leptogium corniculatum
(Seq. ID Nos. 31-42); Parmelia sulcata (Seq. ID Nos. 43-50); Peltigera neopolydactyla
(Seq. ID Nos. 51-60); Pseudocyphellaria anthraspis (Seq. ID Nos. 61-62); Siphula ceratites
(Seq. ID Nos. 63-66); Thamnolia vermicularis (Seq. ID Nos. 67-68); and Usnea florida
(Seq. ID Nos. 69-80). Each of these sequences showed homology by BLAST analysis to fungal
Type I polyketide synthase.



EXAMPLE 6

The experiment of Example 5 was repeated on DNA from the lichen Solorina
crocea using the degenerate peptide synthetase primers of Example 3. Freshly collected
lichen (approximately 1.2 g) was washed in running tap water to remove conspicuous soil and
field detritus, and then further cleaned under a dissecting microscope. The cleaned sample
was then gently shaken in a 50 ml tube containing about 40 ml of 0.2% SDS for at least 30
minutes and rinsed thoroughly with water. Excess surface water was blotted from the
washed, hydrated lichen, and the sample was frozen at -80 °C for at least 15 minutes then
vacuum dried at room temperature for 4 hours. The lichen was ground in liquid nitrogen
using a mortar and pestle to produce a lichen powder for use in preparing DNA extracts.

To prepare the DNA extracts, 0.28 g of lichen powder was placed into 18 2-ml
microfuge tubes, and each aliquot was mixed with 1.25 ml isolation buffer (150 mM EDTA,
50 mM Tris pH 8, 1% sodium lauroyl sarcosine) and extracted for 1 hour at 62 °C. The
samples were centrifuged for three minutes to pellet cellular debris and a cloudy supernatant
was decanted into new microfuge tubes. Each sample of the supernatant was mixed with 750
µl 7.5 M ammonium acetate, incubated on ice for 30 minutes and centrifuged for five minutes
at 16,000 x g to precipitate proteins. The supernatant fluid was saved in new microfuge
tubes and nucleic acids were precipitated with 0.6 volumes of isopropanol overnight at 4 °C.
Samples were centrifuged for five minutes at 16,000 x g to pellet nucleic acids. The pellets
were dissolved in TE containing RNAse (18 ng total) at 50 °C for 45 minutes. The solutions
were then extracted with an equal volume of TE-saturated phenol:chloroform (1:1), and again
with chloroform. DNA in the aqueous phase was precipitated with 0.1 M sodium acetate and
two volumes of ethanol at -20 °C for 2 hours, and then pelleted by centrifugation for five
minutes at 16,000 x g. The DNA pellet was washed with 75% ethanol, vacuum dried at
room temperature for 3 minutes and then dissolved in TE. The final amount of DNA
recovered was approximately 70 ng according to fluorometric measurement.

Two clones containing the expected 1.2 kb insert were sequenced and found to
contain the same sequence shown in Seq. ID No. 23. Seq. ID No. 24 shows the translated
amino acid sequence. The sequence is distinct, with greatest homology as determined by
BLAST analysis to the peptide synthase module of the cyanobacterium Microcystis aeruginosa.

EXAMPLE 7

The experiment of Example 4 was repeated using the Type II polyketide
synthase primers given by Seq. ID Nos. 5 and 6. Three starting samples were used for
recovery of Type II polyketide synthase genes: two uncharacterized Streptomyces strains
(strains WEC 68A and WEC 71B) which had been shown to contain Type II polyketide
synthase genes, and a soil sample obtained from a forest area near Vancouver, British
Columbia. The soil sample was prepared using the basic protocol from Holben et al., Appl.
Environ. Microbiol. 54: 703-711 (1988) with variations in parameters such as mix time to
adjust for the individual characteristics of the soil samples.

Streptomyces genomic DNA preparations suitable for PCR amplification were
prepared from the mycelia harvested from a 50 ml culture in tryptic soy broth (Difco) which
had been grown for 3 days at 30 °C. The mycelia were collected by centrifugation at 2500 x
g for 10 minutes, the pellets were washed in 10% v/v glycerol and the washed pellets were
frozen at -20 °C. The size of the pellets will vary with different strains; for extraction, 1 g
samples were suspended in 5 ml TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA) in a 50
ml screw cap Oakridge tube and lysozyme (to 10 mg/ml) and RNase (to 40 µg/g) were added.
Following incubation at 30 °C for 45 min., a drop of each suspension was transferred to a
microscope slide, one drop of 10% SDS was added and the suspension was checked for
complete clearing and increased viscosity, indicating lysis. Most strains lyse with this
incubation time, but incubation in lysozyme may be continued if necessary. (For strains
which are very resistant to lysis, small amounts of DNA suitable for PCR amplification may
often be prepared on a FastPrep™ instrument as described below.) Following confirmation
of sufficient incubation time in lysozyme, 1.2 ml of 0.5 M EDTA, pH 8.0 was added to the
suspension and mixed gently, then 0.13 ml of 10 mg/ml Proteinase K (Gibco/BRL) solution
was added and incubated for 5 min. at 30 °C. Then 0.7 ml of 10% SDS was added, mixed gently
by tilting, and incubated again at 30 °C for 2 hours. Following lysis, three successive
phenol/chloroform extractions were performed by adding a volume equivalent to the aqueous
phase each time of a 1:1 mixture of ultrapure Tris buffer saturated phenol (Gibco/BRL) and
chloroform. The aqueous phase was recovered each time following centrifugation at 2500 x
g for 10 min. in a shortened (i.e. wide bore) Pasteur pipet to minimize shearing; DNA was
precipitated from the final aqueous phase with the addition of 0.1 volume of 3 M Na acetate,
pH 4.8 and 1 volume of isopropanol at room temperature. DNA was spooled from the
solution onto a sealed Pasteur pipet, rinsed in ice cold 70% ethanol and solubilized in 0.5 ml
TE buffer overnight at room temperature. DNA yields (as determined
spectrophotometrically) typically range from 1 to 3 mg from 1 g of mycelia.

An alternative method for the preparation of small amounts of Streptomyces
DNA suitable for PCR amplification has been found to be useful for strains resistant to lysis
or when a faster method is desirable. This method makes use of the FastPrep™ instrument
(Savant) and the methods and kit supplied by BIO 101 (Bio/Can Scientific, Mississauga,
Canada). A 2 ml aliquot from a 20 ml, 3 day culture in tryptic soy broth is pelleted in a 2 ml
microfuge tube and the size of the mycelial pellet is estimated. "Small" pellets are
resuspended in 100 µl of sterile distilled water; larger pellets are resuspended in 200-300 µl of
water. 200 µl of suspension is transferred to a homogenization tube from the kit. Following
the manufacturer's protocol for the preparation of DNA from medium hard tissue, the large
bead is added to this tube (which already contains a small bead), 1 ml of solution CLS-TC
from the kit is added and the samples are processed in the instrument for 10 seconds at speed
setting 4.5. Samples are then spun 15 min. at 10,000 x g at 4 °C and 600 µl of the supernatant
is transferred to a clean microfuge tube, 400 µl of Binding Matrix is added and mixed gently,
then the sample is spun for 1 min. as above. The supernatant is discarded while the pellet is
resuspended in 500 µl SEWS-M and transferred to a SPIN™ Filter unit. This is spun for 1
minute, the contents of the catch tube are discarded and the unit is spun again to dry. The
filter unit is transferred to a new microfuge tube and DNA is eluted from the matrix in 100 µl
DES which is left on the filter for 2-3 min. at room temperature. Eluted DNA is collected by
spinning once again and this DNA is now ready to use in PCR amplifications. Due to
components of the final solution, DNA prepared by this method is difficult to quantify.
Typically 1 µl or 1/10 µl of this eluate is suitable as a template for PCR;
larger quantities may be inhibitory to the PCR polymerase.

PCR amplification was carried out in a total volume of 50 µl containing 50 ng
of DNA, 5% DMSO, 1.25 mM MgCl2, 200 µM of each deoxynucleotide triphosphate, 0.5 µg
of each primer and 5.0 units of Taq polymerase (BRL Life Technologies, Gaithersburg, MD).
The thermal cycling started with a 'touch-down' sequence, lowering the annealing temperature
from 65°C to 58°C over the course of 8 cycles. The temperature of the annealing step
was then maintained at 58°C for a further 35 cycles. The overall cycle used was: denaturation
at 94°C for 45 seconds, annealing at 65°C to 58°C for 1 minute and extension at 72°C
for 2 minutes. The size of the amplified fragments was expected to be approximately 1.5 kb.
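
The touch-down schedule described above can be written out cycle by cycle. The sketch below is illustrative only: it assumes the annealing temperature falls in equal steps (1°C per cycle, so that cycle 1 anneals at 65°C and cycle 8 at 58°C), and it takes the extension temperature as a parameter with an assumed default of 72°C. The function name and the dictionary layout are hypothetical.

```python
def touchdown_profile(start_c=65.0, end_c=58.0, touchdown_cycles=8,
                      plateau_cycles=35, extension_c=72.0):
    """Return one dict per PCR cycle with (temperature in C, seconds) per step.

    Assumptions: equal annealing steps across the touch-down cycles and an
    extension temperature of 72 C (passed as a parameter so it can be changed).
    """
    step = (start_c - end_c) / (touchdown_cycles - 1)
    annealing = [start_c - i * step for i in range(touchdown_cycles)]
    annealing += [end_c] * plateau_cycles
    return [
        {
            "cycle": n,
            "denature": (94.0, 45),      # 94 C for 45 seconds
            "anneal": (temp, 60),        # 1 minute
            "extend": (extension_c, 120),  # 2 minutes
        }
        for n, temp in enumerate(annealing, start=1)
    ]


profile = touchdown_profile()
for row in profile[:9]:
    print(row["cycle"], row["anneal"][0])
print("total cycles:", len(profile))
```

Under these assumptions the program runs 43 cycles in total: 8 touch-down cycles followed by 35 cycles at 58°C.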

Amplification of DNA from the two Streptomyces strains produced DNA fragments of the
expected size (1482 bp and 1538 bp). Open reading frame analysis of the two sequences
revealed the presence of a set of three ORFs each, corresponding to the 3'-ends of the putative
KSα-subunit genes (50 to 60 bp), possible full-length KSβ genes (approx. 1.2 kb) and the first
halves of potential ACP genes (approx. 100 bp). In each sequence, the first and second ORFs
were linked by a stop codon overlap typical of KSα/β gene pair junctions and a possible
indication of tight coexpression through translational coupling. The two KSβ genes were
separated from the downstream ACP genes by a short spacer, again consistent with the
expected gene organization.

Two clones that were found to contain 1.5 kb inserts were selected from among the clones
created using the soil DNA as a source. These inserts were sequenced and found
to exhibit similarity to known KSβ genes, with three ORFs as described above. The translated
amino acid sequences of the four genes are shown in Sequence ID Nos. 25 to 28.

The four putative KSβ genes had a G+C content over 70%, which is typical for
the coding regions of Actinomycete genes. Results of database searches established that the
deduced products of all four ORFs were similar to known KSβ gene products from Type II
polyketide synthases, but they did not match any known sequences.
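
The G+C content referred to above is a simple base count. A minimal sketch follows; the function name is hypothetical, and the handling of ambiguity codes (S counted as G or C, W as A or T, everything else ignored) is an assumption made for illustration.

```python
def gc_percent(sequence):
    """Return the G+C content of a nucleotide sequence as a percentage.

    Assumption: ambiguity code S (G or C) counts toward G+C, W (A or T)
    counts toward A+T, and all other symbols (N, spaces, digits) are ignored.
    """
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GCS")
    at = sum(seq.count(base) for base in "ATW")
    if gc + at == 0:
        raise ValueError("sequence contains no countable bases")
    return 100.0 * gc / (gc + at)


# First 50 nucleotides of SEQ ID NO: 19 as listed below.
print(round(gc_percent("TTCGGCGGGT TCCAGACGGC CATGGTGCTG ACGACGGGAC GGGACAATGA"), 1))
```

A coding region would be flagged as Actinomycete-like under the criterion in the text when this value exceeds 70.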

EXAMPLE 8 

DNA can be extracted from large volumes of soil in accordance with the
following procedure. Place dry soil into a sterile blender with 0.2% sodium pyrophosphate
(100 ml/100 grams of soil). The pH of the sodium pyrophosphate solution should be about
10, although some variation to account for the characteristics of the soil may be appropriate.
The mixture is blended for 30 seconds, decanted into centrifuge bottles and then centrifuged
for 15 minutes at 100 x g at 4°C. The supernatant is decanted, filtered two times through
cheesecloth and saved. The pelleted soil is extracted an additional two times using the same
procedure.
After the extractions, the pooled supernatants are centrifuged for 15 minutes at
10,500 x g and the pellets are collected. The pellet may be incubated for 6 hours at 55°C in
pre-germination medium (0.5% w/v yeast extract (Difco), 0.5% w/v casamino acids (Difco)
with 0.005 M CaCl2 and 0.025 M TES, pH 8.0 (added separately from sterile stock after
autoclaving other components)) and then repelleted, or it may be used directly. In either case,
the pellet (approximately 30-200 mg) is mixed with 5 ml 1X TE (pH 8.0), 500 µl 0.5 M
EDTA (pH 8.0) and 500 µl of 20 mg/ml lysozyme in 1X TE (pH 8.0) and incubated for 30
minutes at 37°C. 500 µl of 20% SDS and 100 µl of 1% proteinase K in TE and 1% SDS are
then added and the mixture is vortexed gently before incubating for 60 minutes at 55°C or
overnight at 37°C.

The incubated mixture is combined with 10 ml 20% polyvinylpyrrolidone
(avg. MW=40,000) and incubated for 10 minutes at 70°C. One-half volume of 7.5 M
ammonium acetate (stored at -20°C) is then added, the resulting mixture is placed for 10
minutes on a low speed shaker, and then centrifuged for 20 minutes at 18,500 x g. The
supernatant is combined with 1 volume of isopropanol and incubated for 30 minutes at -20°C
before centrifuging for 20 minutes at 18,500 x g. The pellet from this centrifugation is
washed in 70% ethanol, and centrifuged for 10 minutes at 18,500 x g. The pellet from this
final centrifugation is collected and air dried.
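
The centrifugation steps above are specified as relative centrifugal force (x g) rather than rotor speed. The sketch below converts between the two using the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The 10 cm rotor radius is purely a hypothetical value chosen for illustration; the source does not name a rotor.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_for_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_x_g, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf_x_g / (RCF_CONSTANT * radius_cm))


# Forces used in Example 8, for a hypothetical rotor of 10 cm radius.
for force in (100, 10500, 18500):
    print(f"{force} x g -> {rpm_for_rcf(force, 10.0):.0f} rpm")
```

The rotor radius should be replaced with the value for the rotor actually in use.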

EXAMPLE 9 

To extract DNA from small amounts of soil the following procedure can be
used. Combine soil (approx. 1 g) with 1 ml distilled water, vortex to suspend and pellet at
19,000 x g for 5 minutes. After removing the supernatant, freeze/thaw the samples twice by
either of the following techniques: (a) -20°C freezer, 30 minutes, followed by 50-60°C water
bath (2 minutes), repeated 2 times; or (b) quick freeze in EtOH-dry ice bath (dip in until
frozen, approx. one minute) followed by 60°C water bath (2 minutes), repeated 2 times. The
pellets are then suspended in 350 µl TE buffer (pH 8.0), 50 µl 0.5 M EDTA and 50 µl of 20
mg/ml lysozyme in TE buffer, vortexed and incubated at 37°C for 30 minutes in a water bath.
50 µl of 20% SDS and 10 µl of 1% Proteinase K/1% SDS in TE buffer are added, and the mixture
is vortexed and incubated for one hour at 55°C or overnight at 37°C. One-tenth volume of 20%
polyvinylpyrrolidone (avg. MW=40,000) is then added and the mixture is incubated at 70°C for 10 minutes.
One-half volume of 7.5 M ammonium acetate (stored at -20°C) is added, the tubes are shaken
at low speed for ten minutes and then centrifuged at 19,000 x g for 20 minutes. The
supernatant is collected using pipets with cut tips to avoid shearing DNA, combined with one
volume of isopropanol, mixed gently, and stored at -20°C for 30 minutes or 4°C overnight.
The DNA is then collected as a pellet by centrifugation at 19,000 x g for 10 minutes. The
resulting pellet is washed with 0.5 ml of 70% ethanol (stored at -20°C) and then air or
vacuum dried. The dried DNA is then dissolved in 50-150 µl of TE buffer, incubated at 4°C
for one hour and then heated to 60°C for 10 minutes to facilitate dissolving the DNA. The
resulting solutions are stored at -20°C until use.
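
The relative volumes in the small-scale procedure ("one-tenth volume", "one-half volume", "one volume") can be tracked with a short calculation. The sketch below is an illustration under stated assumptions: the liquid volume of the soil pellet is ignored, all of the supernatant is assumed to be recovered before the isopropanol step, and the function and key names are hypothetical.

```python
def example9_volumes_ul(te=350.0, edta=50.0, lysozyme=50.0,
                        sds=50.0, proteinase_k=10.0):
    """Track reagent volumes (in microlitres) through the small-scale protocol.

    Assumptions: pellet volume is ignored and the whole supernatant is
    carried forward to the isopropanol precipitation.
    """
    digest = te + edta + lysozyme + sds + proteinase_k
    pvp = digest / 10.0                      # one-tenth volume of 20% PVP
    after_pvp = digest + pvp
    ammonium_acetate = after_pvp / 2.0       # one-half volume of 7.5 M stock
    after_salt = after_pvp + ammonium_acetate
    return {
        "digest": digest,
        "pvp": pvp,
        "ammonium_acetate": ammonium_acetate,
        "isopropanol": after_salt,           # one volume
        "ammonium_acetate_molar": 7.5 * ammonium_acetate / after_salt,
        "pvp_percent": 20.0 * pvp / after_pvp,
    }


for name, value in example9_volumes_ul().items():
    print(f"{name}: {value:.2f}")
```

Adding one-half volume of a 7.5 M stock always gives a final concentration of 2.5 M, whatever the starting volume, which is a quick consistency check on the pipetted amounts.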
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Terragen Diversity Inc.

(ii) TITLE OF INVENTION: METHOD FOR ISOLATION OF BIOSYNTHESIS 
GENES FOR BIOACTIVE MOLECULES 

(iii) NUMBER OF SEQUENCES: 94 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Deeth Williams Wall 

(B) STREET: National Bank Building, 150 York Street, Suite 400 

(C) CITY: Toronto 

(D) STATE: Ontario 

(E) COUNTRY: Canada 

(F) ZIP: M5H 3S5 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.5 inch, 1.44 Mb 

(B) COMPUTER: Dell (IBM Compatible) 

(C) OPERATING SYSTEM: Windows 95 

(D) SOFTWARE: Word 97

(vi) CURRENT APPLICATION DATA : 

(A) APPLICATION NUMBER: Not yet assigned 

(B) FILING DATE: May 21, 1998 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/861,774 

(B) FILING DATE: May 22, 1997 

(viii) ATTORNEY/AGENT INFORMATION : 

(A) NAME: Eileen McMahon 

(B) REGISTRATION NUMBER: 

(C) REFERENCE/DOCKET NUMBER: 1694/0005 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 416-941-9440 

(B) TELEFAX: 416-941-9443 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GCSRTSGACC CGCAGCGCGC 20 
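
The primer sequences in this listing use IUPAC ambiguity codes (for example S for G or C, R for A or G). As a hedged illustration of what such a degenerate primer represents, the sketch below counts and enumerates the concrete oligonucleotides covered by SEQ ID NO: 1. The function names are hypothetical, and inosine (written I in some later primers) is deliberately not handled.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes (inosine "I" is not included).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.replace(" ", "").upper())


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


seq_id_1 = "GCSRTSGACC CGCAGCGCGC"
print(degeneracy(seq_id_1))
for oligo in expand(seq_id_1):
    print(oligo)
```

SEQ ID NO: 1 contains two S positions and one R position, so it stands for 2 x 2 x 2 = 8 oligonucleotides.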

(2) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GATSRCGTCC GCRTTSGTSC C 21 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CTSACSKSGG SCGNACSGCS ACSCG 25

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GTTSACSGCG TAGAACCASG CGAA 25 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TTCGGSGGNT TCCAGWSNGC SATG 24 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TCSAKSAGSG CSANSGASTC GTANCC 26

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGBTCSGGST TYTTCTACGC 20 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CCTSGGTCTG GWASAGSACG 20 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ATCTACACST CSGGCACSAC SGGCAAGCCS AAGGG 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 




AWNGAGKSNC CICCSRRSNM GAAGAA 26

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
MGIGARGCIY TIGCIATGGA YCCICARCAR MG 32

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGRTCNCCIA RYTGIGTICC IGTICCRTGI GC 32 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GCGGTGGACC CGCAGCAGCG CCTCATGCTG GAGCTGGCCT GGTCCGCGCT 50 

GGAAAGCGCA GGTCATCCGC CCTCGATATT CCCCGGCCTG ATCGGGGTCT 100 

ATGTCGGCAT GAACTGGAAT CGCTATCGCG CGAATTGCAT TTCTGCACAC 150 

CCTGATGTGG TGGAGCGATT CGGTGAATTG AACACAGCGC TCGCCAACGA 200 

ATACGACTTT CTTGCTACCC GAATCTCCTA CAAGCTCAAT CTGCGCGGTC 250 

CCAGCGTCAC TATCAGCACC GCTTGTTCGA CTTCCCTGGT TGCCATTGCT 300

CAGGCTTCGC AGGCGTTGCT CAACTATGAA TGCGACATTG CTTTGGCTGG 350 

GGTTGCCTCC ATAACCGTGC CTGTCAATGC AGGCTACCTC TACCAAGAAA 400 

GGTGGCATGC TTTCACCGAA GGGCATTGTC CTACATTCGA TGCCCCAGCA 450 

CGGGACCACT TCAATGATGC CCCCTGTCTC CTTTTTGCGG GCCTGGAAAA 500 

CCCATCCAGG AGGGGGGGGG GGGCCCTCAT ACCCGGCCTT TCAAGCGGGA 550 

ACCTCTCACA GGAAGCGGAT GTTTCAGCCG AAGGGATGTT GAACATTGAC 600 

GCCGGCAGCA CGGGGGACAA GTTCAGGGAT GGGCGCGCTT TTGTTGTATG 650 

GGGGGGGCCT GGAAGAAGCA TTCAAGGGAC GGTGATCAAA CTTAACCCCT 700

TCATTGGCGG GTTTGCCGCG GAACAAGGAC GGGTTCGGAC AAGGCGAGTT 750 

TACCGGCGCC CAGGCGTCAA TGGTCAGGGC GGAGTTCATT TCGCTTTGGC 800 
GGTGGAGTTT GCGGGATATT CGAATCCCGC AAGCATCGGG ATTTCATTCG 850

AAAACCCACG GGCACGGGCG ACGCCATTGG GCGATCCGAT AGAAGTGGCC 900 

GCGCTAAAGA TGGTTTTTCG CCGACGCTCG TTCCAGAGGC GCCGTTGCGC 950 

CCTTGGATCG GTCAAGAGTT GTGTCGGACA CCTGGTTCAC GCCGCCGGCG 1000 

TGACCGGATT TATCAAGGCT GTCTTGTCGG TCTACCACGG CAAGATCGCA 1050 

CCGACACTGT TTTTCGAGAA AGCAAATCCG AGGCTCGGGC TGGAAGACAG 1100 

TCCTTTCTAT GTCAATGCCG GACTCGAGAA GTGGACGGCC GCCGAGCAGC 1150 

CACGCCGCGC GGGGGTCAGT GCTTTCGGGG TCGGTGGCAC CAATGCGCAC 1200 

GCGATC 1206 
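
The amino acid record that follows (SEQ ID NO: 14) reads as the translation of SEQ ID NO: 13 in its first reading frame. As a hedged illustration, the sketch below translates the first 30 nucleotides of SEQ ID NO: 13 with the standard genetic code; the function name is hypothetical and the one-letter output (A V D P Q Q R L M L) corresponds to Ala Val Asp Pro Gln Gln Arg Leu Met Leu.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering of codon positions.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    seq = dna.upper().replace(" ", "")
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO: 13.
print(translate("GCGGTGGACC CGCAGCAGCG CCTCATGCTG"))
```

Codons containing ambiguity codes are reported as X here, which mirrors the Xaa entries used elsewhere in the listing.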

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Ala Val Asp Pro Gln Gln Arg Leu Met Leu Glu Leu Ala Trp Ser
5 10 15

Ala Leu Glu Ser Ala Gly His Pro Pro Ser Ile Phe Pro Gly Leu
20 25 30

Ile Gly Val Tyr Val Gly Met Asn Trp Asn Arg Tyr Arg Ala Asn
35 40 45

Cys Ile Ser Ala His Pro Asp Val Val Glu Arg Phe Gly Glu Leu
50 55 60

Asn Thr Ala Leu Ala Asn Glu Tyr Asp Phe Leu Ala Thr Arg Ile
65 70 75

Ser Tyr Lys Leu Asn Leu Arg Gly Pro Ser Val Thr Ile Ser Thr
80 85 90

Ala Cys Ser Thr Ser Leu Val Ala Ile Ala Gln Ala Ser Gln Ala
95 100 105

Leu Leu Asn Tyr Glu Cys Asp Ile Ala Leu Ala Gly Val Ala Ser
110 115 120

Ile Thr Val Pro Val Asn Ala Gly Tyr Leu Tyr Gln Glu Arg Trp
125 130 135

His Ala Phe Thr Glu Gly His Cys Pro Thr Phe Asp Ala Pro Ala
140 145 150

Arg Asp His Phe Asn Asp Ala Pro Cys Leu Leu Phe Ala Gly Leu
155 160 165

Glu Asn Pro Ser Arg Arg Gly Gly Gly Ala Leu Ile Pro Gly Leu
170 175 180

Ser Ser Gly Asn Leu Ser Gln Glu Ala Asp Val Ser Ala Glu Gly
185 190 195

Met Leu Asn Ile Asp Ala Gly Ser Thr Gly Asp Lys Phe Arg Asp
200 205 210

Gly Arg Ala Phe Val Val Trp Gly Gly Pro Gly Arg Ser Ile Gln
215 220 225

Gly Thr Val Ile Lys Leu Asn Pro Phe Ile Gly Gly Phe Ala Ala
230 235 240

Glu Gln Gly Arg Val Arg Thr Arg Arg Val Tyr Arg Arg Pro Gly
245 250 255

Val Asn Gly Gln Gly Gly Val His Phe Ala Leu Ala Val Glu Phe
260 265 270

Ala Gly Tyr Ser Asn Pro Ala Ser Ile Gly Ile Ser Phe Glu Asn
275 280 285

Pro Arg Ala Arg Ala Thr Pro Leu Gly Asp Pro Ile Glu Val Ala
290 295 300

Ala Leu Lys Met Val Phe Arg Arg Arg Ser Phe Gln Arg Arg Arg
305 310 315

Cys Ala Leu Gly Ser Val Lys Ser Cys Val Gly His Leu Val His
320 325 330

Ala Ala Gly Val Thr Gly Phe Ile Lys Ala Val Leu Ser Val Tyr
335 340 345

His Gly Lys Ile Ala Pro Thr Leu Phe Phe Glu Lys Ala Asn Pro
350 355 360

Arg Leu Gly Leu Glu Asp Ser Pro Phe Tyr Val Asn Ala Gly Leu
365 370 375

Glu Lys Trp Thr Ala Ala Glu Gln Pro Arg Arg Ala Gly Val Ser
380 385 390

Ala Phe Gly Val Gly Gly Thr Asn Ala His Ala Ile
395 400

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 565

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGCTCCGGGT TTTTCTACGC GTCCAACCAC GGGATCGACG TCACGCGGGT 50 

GCGCGACGAG GTGAACAAGT TCCACGCCGA GATGACGCCC GGGGAGAAGT 100 

TCGAGCTGGC CATCAACGCC TACAACGACG CGAATCCGCA TACCCGCAAC 150 

GGGTATTACA TGGCCGTCGA AGGCAAGAAG GCCGTCGAGT CCTTCTGCTA 200 

CCTCAACCCG GCCTTCACCC CCGAGCACCC GATGATCGAG GCGGGCGCGG 250 

CGGGGCACGA GGTGAACAAC TGGCCGGACG AGGCTCGCCA CCCCGGCTTC 300

CGTGAGTACG GGGGAGCAGT ACTTCGAAGA GGATCCTCCG ACCTGTCACT 350 

GGTGCTGCTG CGTGGGTACG CGCTGGCCCT GGGCAAGGAC GAGAACTACT 400

TCGACGACTA CGTCAAGCAC TCCGACACGC TCTCGGCCGT CTCGCTGATC 450 

CGTTACCCGT ACCTGGAGAA CTACCCGCCG GTGAAGACCG GTCCGGACGG 500 

CGAGAAGCTC AGCTTCGAGG ATCACTTCGA CGTCTCGCTG ATCACCGTGC 550 

TCTTCCAGAC CCAGG 565 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 188 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Gly Ser Gly Phe Phe Tyr Ala Ser Asn His Gly Ile Asp Val Thr
5 10 15

Arg Val Arg Asp Glu Val Asn Lys Phe His Ala Glu Met Thr Pro
20 25 30

Gly Glu Lys Phe Glu Leu Ala Ile Asn Ala Tyr Asn Asp Ala Asn
35 40 45

Pro His Thr Arg Asn Gly Tyr Tyr Met Ala Val Glu Gly Lys Lys
50 55 60

Ala Val Glu Ser Phe Cys Tyr Leu Asn Pro Ala Phe Thr Pro Glu
65 70 75

His Pro Met Ile Glu Ala Gly Ala Ala Gly His Glu Val Asn Asn
80 85 90

Trp Pro Asp Glu Ala Arg His Pro Gly Phe Arg Glu Tyr Gly Gly
95 100 105

Ala Val Leu Arg Arg Gly Ser Ser Asp Leu Ser Leu Val Leu Leu
110 115 120

Arg Gly Tyr Ala Leu Ala Leu Gly Lys Asp Glu Asn Tyr Phe Asp
125 130 135

Asp Tyr Val Lys His Ser Asp Thr Leu Ser Ala Val Ser Leu Ile
140 145 150

Arg Tyr Pro Tyr Leu Glu Asn Tyr Pro Pro Val Lys Thr Gly Pro
155 160 165

Asp Gly Glu Lys Leu Ser Phe Glu Asp His Phe Asp Val Ser Leu
170 175 180

Ile Thr Val Leu Phe Gln Thr Gln
185



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1172 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



AAGGAGGGGC CGCCCGGGGC GAAGAAGCTG TCCGTCCGAC TGACACGTTC 50
CACTCCGAGG AGCCCGGACC AGATGCGCGC CAGCTTTACC TCGACCGGCG 100
TAGATGGCGG GTCGTAGTCA GTGCGATCCG ATGAGTCATC TGGAGGTGCA 150
GGCAGCACCT TCAGATCGAT CTTGCCGCTC GCCATGCGCG GCATCTCGCG 200
GAGCTCGACG AATGCAGCCG GAATCATGTA CTCGGGCAAC CGCGTGCGAA 250
GATGATCGCG CAGCTCGGAC GCGGCGACCG AGGCGAGCCG AGGCGACCAG 300
TACGCAACGA GACGCTTGTC GCCGGCCCGC TCCTGCCGCG CCAGGACGAC 350
GGCCGTCTCG ACACCGGGGT GATCGGCCAG CGCCGCCTCG ATCTCACCGA 400
GCTCGATGCG GAAGCCGCGG ATCTTGACCT GATGATCCGC GCGCCCGATG 450
AAGTCGAGGT TGCCGTCCGG AAGCCAGCGC ACCAGGTCGC CGGTCCGGTA 500
CAGCCGCGAG CCAGGTGCAC CGAATGGATC GGGTACGAAC CGCGCTCCGG 550
TGAGGGCGGC ATCATCGACA TAGCCGCGCG CGAGGTTCTC GCCACCGATG 600
TACAGCTCGC CGATCACGCG CGCCGGAACG GGCTCGAGTG CGCTATCGAG 650
CACGTAGACC TGAACGTTGT CGAGCGGACG GCCGATCGAC GGCAGCTCGG 700
ACCCGTGTTC GGACGCGGGC GACACGATCG CCCACGTCGT ATCGACCGCG 750
TTCTCCGTCG GGCCGTACTC GTTGAGCATG CGGTAGTGCG CATCGCGCGG 800
TGGACGCCGC GTGAGTCGAT CACCGCCCGT ACGCAGCACG CGCAACGAGC 850
GTGGAAAGTC GCCAGCCGCG AGCAACGCGT CGAGTAGCCG GCCTGGAAGA 900
TCGGAGATCG TGATCCCCCA TCGCGTCAGG TTCTCGAGCA GGCGCGGCGG 950
ATCGAGGCGG AGCTCGTTGT CCACCAGATG AAGCCGGGCG CCCGTCGCCA 1000
GCGTGGACCA CAGCTCGAGC GCCGCGGCAT CGAACGACAT CGAGTAGATC 1050
TGCGTCACGC GGTCGTCGGC ACTGATCTCG ACGGCACGCT GGTTCCACGC 1100
GATCAAATTT CTCAGTGCAC GGTGCGGCAC GGCGACGCCC TTCGGCTTGC 1150
CCGTCGTGCC CGACGTGTAG AT 1172
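
As listed, the last 35 nucleotides of SEQ ID NO: 17 appear to be the reverse complement of one variant of the degenerate primer of SEQ ID NO: 9. The sketch below checks this; it is an illustration only, the function names are hypothetical, and the two sequence strings are copied from this listing.

```python
# Standard IUPAC codes needed to match a degenerate primer against a target.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches_degenerate(primer, target):
    """True if every target base is allowed by the primer's IUPAC code."""
    return len(primer) == len(target) and all(
        base in IUPAC[code] for code, base in zip(primer, target))


PRIMER_SEQ_ID_9 = "ATCTACACSTCSGGCACSACSGGCAAGCCSAAGGG"
TAIL_SEQ_ID_17 = "CCCTTCGGCTTGCCCGTCGTGCCCGACGTGTAGAT"  # last 35 nt as listed

flipped = reverse_complement(TAIL_SEQ_ID_17)
print(flipped)
print(matches_degenerate(PRIMER_SEQ_ID_9, flipped))
```

A match of this kind suggests that the listed strand is the complement of the strand that encodes the following amino acid record, which begins Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly.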



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 
(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Ala Val
5 10 15

Pro His Arg Ala Leu Arg Asn Leu Ile Ala Trp Asn Gln Arg Ala
20 25 30

Val Glu Ile Ser Ala Asp Asp Arg Val Thr Gln Ile Tyr Ser Met
35 40 45

Ser Phe Asp Ala Ala Ala Leu Glu Leu Trp Ser Thr Leu Ala Thr
50 55 60

Gly Ala Arg Leu His Leu Val Asp Asn Glu Leu Arg Leu Asp Pro
65 70 75

Pro Arg Leu Leu Glu Asn Leu Thr Arg Trp Gly Ile Thr Ile Ser
80 85 90

Asp Leu Pro Gly Arg Leu Leu Asp Ala Leu Leu Ala Ala Gly Asp
95 100 105

Phe Pro Arg Ser Leu Arg Val Leu Arg Thr Gly Gly Asp Arg Leu
110 115 120

Thr Arg Arg Pro Pro Arg Asp Ala His Tyr Arg Met Leu Asn Glu
125 130 135

Tyr Gly Pro Thr Glu Asn Ala Val Asp Thr Thr Trp Ala Ile Val
140 145 150

Ser Pro Ala Ser Glu His Gly Ser Glu Leu Pro Ser Ile Gly Arg
155 160 165

Pro Leu Asp Asn Val Gln Val Tyr Val Leu Asp Ser Ala Leu Glu
170 175 180

Pro Val Pro Ala Arg Val Ile Gly Glu Leu Tyr Ile Gly Gly Glu
185 190 195

Asn Leu Ala Arg Gly Tyr Val Asp Asp Ala Ala Leu Thr Gly Ala
200 205 210

Arg Phe Val Pro Asp Pro Phe Gly Ala Pro Gly Ser Arg Leu Tyr
215 220 225

Arg Thr Gly Asp Leu Val Arg Trp Leu Pro Asp Gly Asn Leu Asp
230 235 240

Phe Ile Gly Arg Ala Asp His Gln Val Lys Ile Arg Gly Phe Arg
245 250 255

Ile Glu Leu Gly Glu Ile Glu Ala Ala Leu Ala Asp His Pro Gly
260 265 270

Val Glu Thr Ala Val Val Leu Ala Arg Gln Glu Arg Ala Gly Asp
275 280 285

Lys Arg Leu Val Ala Tyr Trp Ser Pro Arg Leu Ala Ser Val Ala
290 295 300

Ala Ser Glu Leu Arg Asp His Leu Arg Thr Arg Leu Pro Glu Tyr
305 310 315

Met Ile Pro Ala Ala Phe Val Glu Leu Arg Glu Met Pro Arg Met
320 325 330

Ala Ser Gly Lys Ile Asp Leu Lys Val Leu Pro Ala Pro Pro Asp
335 340 345

Asp Ser Ser Asp Arg Thr Asp Tyr Asp Pro Pro Ser Thr Pro Val
350 355 360

Glu Val Lys Leu Ala Arg Ile Trp Ser Gly Leu Leu Gly Val Glu
365 370 375

Arg Val Ser Arg Thr Asp Ser Phe Phe Ala Pro Gly Gly Pro Ser
380 385 390

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTCGGCGGGT TCCAGACGGC CATGGTGCTG ACGACGGGAC GGGACAATGA 50 
GAAGTAGCGT CGCGGTCACC GGCATCGGCC TGGTGGCCGC CAACGGGCTC 100 
ACCACCGAGG ACGTGTGGTC GGCCGTGCTC GGCGGCCGCA GCGGCCTTGG 150 
AACGATCACC CGTTTCGACG CCGCGGGCTA CCCGGCCCGG ATCGCCGGCG 200 
AGGTGTCGCA GTTCGTGGCC GAGGAGCACA TCGCCGACCG GCTGATCCCG 250 
CAGACCGACC ACATGACCCG GCTGGCGCTG GCCGCGGCCG AGTCGGCGAT 300
CCGGGACGCC AAGGTGGGAC CTGGCCGAGC TGCCCGATTC GGCGCGGGCG 350 
TGGTCACCGC CGCGACGGCA GGCGGCTTCG AGTTCGGCCA GCGGGAGCTG 400 
GAGAACCTGT GGCGCAAGGG GCCTGAGCAC GTCAGCCCCT ACCAGTCCTT 450 
CGCCTGGTTC TACGCCGTCA AC 472 



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Arg Ser Ser Val Ala Val Thr Gly Ile Gly Leu Val Ala Ala
5 10 15

Asn Gly Leu Thr Thr Glu Asp Val Trp Ser Ala Val Leu Gly Gly
20 25 30

Arg Ser Gly Leu Gly Thr Ile Thr Arg Phe Asp Ala Ala Gly Tyr
35 40 45

Pro Ala Arg Ile Ala Gly Glu Val Ser Gln Phe Val Ala Glu Glu
50 55 60

His Ile Ala Asp Arg Leu Ile Pro Gln Thr Asp His Met Thr Arg
65 70 75

Leu Ala Leu Ala Ala Ala Glu Ser Ala Ile Arg Asp Ala Lys Val
80 85 90

Gly Pro Gly Arg Ala Ala Arg Phe Gly Ala Gly Val Val Thr Ala
95 100 105

Ala Thr Ala Gly Gly Phe Glu Phe Gly Gln Arg Glu Leu Glu Asn
110 115 120

Leu Trp Arg Lys Gly Pro Glu His Val Ser Pro Tyr Gln Ser Phe
125 130 135

Ala Trp Phe Tyr Ala Val Asn
140

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TATATTACTC CAGGTTGCTT ACGAAGCATT GGAGATGTCC GGATATTTCG 50 
CCGATTCGTC CAGGCCTGAG GATGTCGGTT GCTATATTGG AGCTTGTGCA 100 
ACAGATTACG ATTTCAACGT AGCATCCCAT CCTCCCACGG CGTATTCAGC 150 
GACTGGCACG CTCCGATCTT TTCTAAGTGG CAAGCTGTCG CATTACTTTG 200 
GTTGGTCCGG TCCCTCTCTT GTCCTAGACA CTGCCTGCTC TTCGTCGGCG 250
GTGGCTATTC ATACTGCATG TACTGCTTTG AGGACTGGCC AGTGTTCTCA 300
AGCTCTAGCA GGCGGGATCA CGTTGATGAC AAGCCCGTAT CTCTATGAGA 350
ACTTCTCTGC AGCCCATTTC TTGAGTCCAA CGGGAGGTTC AAAGCCGTTC 400
AGCGCAGRTG CAGATGGATA CTGTAGAGGA GAAGGTGGTG GCCTCGTGGT 450

CTTGAAACGA CTTTCAGATG CTCTCAGGGA TGATGACCAT ATTATTAGTG 500 

TCATCGCTGG CTCGGCGGTC AACCAGAACG ACAACTGCGT GCCTATCACC 550 

GTCCCTCACA CTTCGTCTCA GGGAAATCTC TATGAACGAG TTACCAGACA 600 

GGCAGGGGTG ACACCCAATA AAGTCACTTT TGTGGAA 637 

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 
Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr
5 10 15

Phe Ala Asp Ser Ser Arg Pro Glu Asp Val Gly Cys Tyr Ile Gly
20 25 30

Ala Cys Ala Thr Asp Tyr Asp Phe Asn Val Ala Ser His Pro Pro
35 40 45

Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly
50 55 60

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu
65 70 75

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys
80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly
95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Tyr Glu Asn Phe Ser Ala
110 115 120

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala
125 130 135

Xaa Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val
140 145 150

Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asp Asp His Ile Ile
155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Asp Asn Cys Val
170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu
185 190 195

Arg Val Thr Arg Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe
200 205 210

Val Glu



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GCACGACGGG CAAGCCCAAG GGGGGCGATG AACAGCCATC GAGGAATTTG 50 
CAATCGCTTA CTGTGGATGC AAGATGCTTA CAAACTAACT GAAACTGATC 100 
GCGTTCTGCA AAAAACGCCT TTTAGTTTCG ACGTTTCCGT TTGGGAGTTT 150 
TTCTGGCCTC TCTTGACAGG GGCGCGTTTA GTGATGGCTC AACCAGGCGG 200 
ACAGCGAGAT GCAACTTACT TAATTAACAC CATCGTCCAA GAGGAAATTA 250
CAACACTGCA TTTTGTCCCC TCCATGTTGC GGATATTTCT CCAAACTAAA 300 
GGGCTAGAAC GTTGTCAATC TCTAAAACGG GTGTTTTGTA GTGGAGAAGC 350 
CTTACCAGTT GACCTCCAGG AGCGGTTTTT TGACTCGATG GGATGTGAAC 400
TACACAACCT CTATGGTCCT ACCGAAGCGG CAATTGATGT CACATTTTGG 450 
CAGTGTCAAA GAGAGAGTAA CTTAAAAAGT GTACCGATTG GGAGAGCGAT 500 
CGCCAACACT CAAMTTTATA TCCTCGACTC CCATTTACAA GCAGTTCCCT 550 
TGGGTGCGAT CGGCGAACTT TATATTGGTG GTATCGGCGT TGCTAGAGGS 600 
TATCTTAACC GTCCAGACTT AACAGCCGAG CGATTTATTT CCCATCCCTT 650 
TAAGGAAGGC GRRAAACTTT ACAAAACAGG AGACTTAGCC CGATATCTGG 700 
CCGATGGCAA TATCGAATAC ATCGGTAGAA TTGATCATCA AGTAAAAATT 750 
CGGGGTTTCC GCATCGAACT TGGAGAAATC GAAACTTTAC TAGCACAACA 800 
CCCGACCATA CAGCAAACTG TCGTCACAGC TAGAATTGAT CATCTCGAAA 850 
ACCAGCGATT AGTCGCCTAC ATCGTTCCTC ATTCAGAGCA GACACTAACC 900 
ACAGACGAAC TGCGCCACTT CCTCAAAAAG AAACTGCCAG AATATATGGT 950 
GCCTAGTACT TTCGTTTTCC TAGACACTCT ACCCCTAACC CCCAACGGCA 1000
AAATTGACCG TCGCGCTTTA CCAGCACCCG ACTCAACAAG GCTTGATTCA 1050 
GAAAACACAT ATCTTGCTCC CCGCGATTAA TTAGAATTTC AGTTGACTAA 1100 
AATTTGGTCA GAAATTTTAG GTATCCAGCC TATCGGTGTC AGGGACAACT 1150 
TCTTCTTCCT TGGGCGGCCC CTCCCTT 1177 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Ala Arg Arg Ala Ser Pro Arg Gly Ala Met Asn Ser His Arg Gly
5 10 15

Ile Cys Asn Arg Leu Leu Trp Met Gln Asp Ala Tyr Lys Leu Thr
20 25 30

Glu Thr Asp Arg Val Leu Gln Lys Thr Pro Phe Ser Phe Asp Val
35 40 45

Ser Val Trp Glu Phe Phe Trp Pro Leu Leu Thr Gly Ala Arg Leu
50 55 60

Val Met Ala Gln Pro Gly Gly Gln Arg Asp Ala Thr Tyr Leu Ile
65 70 75

Asn Thr Ile Val Gln Glu Glu Ile Thr Thr Leu His Phe Val Pro
80 85 90

Ser Met Leu Arg Ile Phe Leu Gln Thr Lys Gly Leu Glu Arg Cys
95 100 105

Gln Ser Leu Lys Arg Val Phe Cys Ser Gly Glu Ala Leu Pro Val
110 115 120

Asp Leu Gln Glu Arg Phe Phe Asp Ser Met Gly Cys Glu Leu His
125 130 135

Asn Leu Tyr Gly Pro Thr Glu Ala Ala Ile Asp Val Thr Phe Trp
140 145 150

Gln Cys Gln Arg Glu Ser Asn Leu Lys Ser Val Pro Ile Gly Arg
155 160 165

Ala Ile Ala Asn Thr Gln Xaa Tyr Ile Leu Asp Ser His Leu Gln
170 175 180

Ala Val Pro Leu Gly Ala Ile Gly Glu Leu Tyr Ile Gly Gly Ile
185 190 195

Gly Val Ala Arg Gly Tyr Leu Asn Arg Pro Asp Leu Thr Ala Glu
200 205 210

Arg Phe Ile Ser His Pro Phe Lys Glu Gly Gly Lys Leu Tyr Lys
215 220 225

Thr Gly Asp Leu Ala Arg Tyr Leu Ala Asp Gly Asn Ile Glu Tyr
230 235 240

Ile Gly Arg Ile Asp His Gln Val Lys Ile Arg Gly Phe Arg Ile
245 250 255

Glu Leu Gly Glu Ile Glu Thr Leu Leu Ala Gln His Pro Thr Ile
260 265 270

Gln Gln Thr Val Val Thr Ala Arg Ile Asp His Leu Glu Asn Gln
275 280 285

Arg Leu Val Ala Tyr Ile Val Pro His Ser Glu Gln Thr Leu Thr
290 295 300

Thr Asp Glu Leu Arg His Phe Leu Lys Lys Lys Leu Pro Glu Tyr
305 310 315

Met Val Pro Ser Thr Phe Val Phe Leu Asp Thr Leu Pro Leu Thr
320 325 330

Pro Asn Gly Lys Ile Asp Arg Arg Ala Leu Pro Ala Pro Asp Ser
335 340 345

Thr Arg Leu Asp Ser Glu Asn Thr Tyr Leu Ala Pro Arg Asp Xaa
350 355 360

Leu Glu Phe Gln Leu Thr Lys Ile Trp Ser Glu Ile Leu Gly Ile
365 370 375

Gln Pro Ile Gly Val Arg Asp Asn Phe Phe Phe Leu Gly Arg Pro
380 385 390

Leu Pro



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 406 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Met Ser Ile Arg Thr Val Val Thr Gly Leu Gly Ile Ala Ala Pro
5 10 15

Asn Gly Leu Gly Ile Glu Glu Tyr Trp Ser Ala Thr Leu Ala Gly
20 25 30

Arg Gly Ala Ile Gly Pro Leu Thr Arg Phe Asp Ala Ser Ser Tyr
35 40 45

Pro Ser Arg Leu Ala Gly Glu Ile Arg Gly Phe Thr Ala Ala Glu
50 55 60

His Leu Pro Gly Arg Leu Leu Pro Gln Thr Asp Arg Met Thr Gln
65 70 75

Leu Ala Leu Val Ser Ala Gly Trp Ala Leu Asp Asp Ala Gly Val
80 85 90

Val Pro Asp Glu Leu Pro Ala Tyr Asp Met Gly Val Ile Thr Ala
95 100 105

Ser His Ala Gly Gly Phe Glu Phe Gly Gln Asn Glu Leu Lys Ala
110 115 120

Leu Trp Ser Lys Gly Gly Lys Tyr Val Ser Ala Tyr Gln Ser Phe
125 130 135

Ala Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg Asn
140 145 150

Gly Met Arg Gly Pro Ser Gly Val Val Val Ser Asp Gln Ala Gly
155 160 165

Gly Leu Asp Ala Leu Ala Gln Ala Arg Arg Gln Ile Arg Lys Gly
170 175 180

Thr Pro Leu Ile Val Ser Gly Ala Val Asp Ala Ser Leu Cys Thr
185 190 195

Trp Gly Trp Val Ala Gln Leu Ala Gly Gly Arg Leu Ser Arg Ser
200 205 210

Asp Asp Pro Gly His Ala Tyr Val Pro Phe Asp Asp Ala Ala Val
215 220 225

Gly His Val Pro Gly Glu Gly Gly Ala Leu Leu Ile Leu Glu Glu
230 235 240

Ala Glu His Ala Arg Ser Arg Gly Ala Arg Arg Ile Tyr Gly Glu
245 250 255

Ile Thr Gly His Ala Ser Thr Phe Asp Pro Pro Pro Trp Ser Gly
260 265 270

Arg Gly Pro Ala Val Gln Arg Val Ile Glu Glu Ala Leu Ala Asp
275 280 285

Ala Gly Thr Val Pro Asp Glu Val Asp Val Val Phe Ala Asp Ala
290 295 300

Ala Ala Leu Pro Glu Leu Asp Arg Ile Glu Ala Ala Ala Ile Thr
305 310 315

Lys Val Phe Gly Pro His Ala Val Pro Val Thr Ala Pro Lys Thr
320 325 330

Met Thr Gly Arg Leu Tyr Ser Gly Ala Ala Pro Leu Asp Val Ala
335 340 345

Ala Ala Cys Leu Ala Ile Arg Asp Gly Leu Ile Pro Pro Thr Ile
350 355 360

His Ser Ser Leu Ser Gly Arg Tyr Glu Ile Asp Leu Val Thr Gly
365 370 375

Ala Pro Arg Thr Ala Pro Val Arg Thr Ala Leu Val Val Ala Arg
380 385 390

Gly His Gly Gly Phe Asn Ser Ala Val Val Val Arg Ala Pro Arg
395 400 405

Asp

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Thr Ser Glu Leu Leu Glu Arg Thr Ala Val Arg Ser Ala Thr
5 10 15

Ala Val Phe Thr Gly Ile Gly Val Thr Ala Pro Asn Gly Leu Gly
20 25 30

Thr Ala Ala Trp Trp Gln Ala Thr Val Ala Gly Glu Ser Gly Ile
35 40 45

Arg Pro Val Ser Arg Phe Asp Ala Ser Gly Tyr Pro Ser Thr Leu
50 55 60

Ala Gly Glu Val Pro Gly Phe Asp Ala Glu Glu His Ile Pro Ser
65 70 75

Arg Leu Leu Ser Gln Thr Asp His Met Thr Arg Leu Ala Leu Thr
80 85 90

Ala Ala Lys Glu Ala Leu Glu Asp Ser Gly Ala Asp Pro Ala Glu
95 100 105

Met Pro Gln Tyr Ser Ala Gly Ala Val Thr Ala Ala Ser Ala Gly
110 115 120

Gly Phe Glu Phe Gly Gln Arg Glu Leu Gln Ala Leu Trp Ser Lys
125 130 135

Gly Gly Gln Tyr Val Ser Ala Tyr Gln Ser Tyr Ala Trp Phe Tyr
140 145 150

Ala Val Asn Thr Gly Gln Ile Ser Ile Arg His Gly Leu Arg Gly
155 160 165

Pro Ser Gly Val Leu Val Thr Glu Gln Ala Gly Gly Leu Glu Ala
170 175 180

Val Ala Gln Ala Arg Arg Gln Leu Arg Lys Gly Ser Lys Leu Ile
185 190 195

Val Thr Gly Gly Val Asp Gly Ala Val Cys Pro Trp Gly Trp Thr
200 205 210

Ala Gln Leu Ala Gly Gly Arg Met Ser Pro Val Ala Asp Pro Ala
215 220 225

Arg Ala Phe Leu Pro Phe Asp Ser Glu Ala Ser Gly Tyr Val Ala
230 235 240

Gly Glu Gly Gly Ala Ile Leu Val Leu Glu Asp Ala Glu Ala Ala
245 250 255

Arg Glu Arg Gly Ala Arg Ile Tyr Gly Arg Leu Ser Gly Tyr Ala
260 265 270

Ala Thr Phe Asp Pro Ala Pro Gly Arg Gly Gly Glu Pro Gly Leu
275 280 285

Arg Arg Ala Ala Glu Leu Ala Leu Thr Glu Ala Gly Leu Ser Ala
290 295 300

Ser Asp Val Asp Val Val Phe Ala Asp Ala Ser Gly Val Pro Glu
305 310 315

Leu Asp Arg Gln Glu Glu Ala Ala Leu Thr Ala Leu Phe Gly Pro
320 325 330

Arg Gly Val Pro Val Thr Ala Pro Lys Thr Met Thr Gly Arg Leu
335 340 345

Ser Ala Gly Gly Ala Ser Leu Asp Leu Ala Ala Ala Leu Leu Ser
350 355 360

Ile Arg Asp Ala Val Ile Pro Pro Thr Val Asn Val Thr Ser Pro
365 370 375

Val Ala Ala Asp Ala Leu Asp Leu Val Thr Glu Ala Arg Arg Gly
380 385 390

Pro Val Arg Thr Ala Leu Val Leu Ala Arg Gly Thr Gly Gly Phe
395 400 405

Asn Ala Ala Ala Val Val Thr Ala Ala Asn
410 415

(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 403 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Met Ile Pro Val Ala Val Thr Gly Met Gly Val Ala Ala Pro Asn
5 10 15

Gly Leu Gly Ala Ala Asp Tyr Trp Ala Ala Thr Arg Gly Gly Lys
20 25 30

Ser Gly Ile Gly Arg Ile Thr Arg Phe Asp Pro Ser Ser Tyr Pro
35 40 45

Ala Arg Leu Ala Gly Glu Ile Pro Gly Phe Glu Ala Ala Glu His
50 55 60

Leu Pro Gly Arg Leu Leu Pro Gln Thr Asp Arg Val Thr Arg Leu
65 70 75

Ser Leu Ala Ala Ala Asp Trp Ala Leu Ala Asp Ala Gly Val Glu
80 85 90

Pro Glu Ser Phe Asp Pro Leu Asp Met Gly Val Val Thr Ala Gly
95 100 105

His Ala Gly Gly Phe Glu Phe Gly Gln Gly Glu Leu Gln Lys Leu
110 115 120

Trp Ala Lys Gly Ser Gln Phe Val Ser Ala Tyr Gln Ser Phe Ala
125 130 135

Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg His Gly
140 145 150

Met Lys Gly Pro Asn Gly Val Val Val Ser Asp Gln Ala Gly Gly
155 160 165

Leu Asp Ala Leu Ala Gln Ala Arg Arg Leu Val Arg Lys Gly Thr
170 175 180

Pro Leu Ile Val Cys Gly Ala Val Asp Ala Ser Ile Cys Pro Trp
185 190 195

Gly Trp Val Ala Gln Leu Ala Gly Gly Arg Met Ser Asp Ser Asp
200 205 210

Glu Pro Ala Arg Ala Tyr Leu Pro Phe Asp Arg Asp Ala Arg Gly
215 220 225

Tyr Leu Pro Gly Glu Gly Gly Ala Ile Leu Ile Met Glu Pro Ala
230 235 240

Ala Ala Ala Arg Ala Arg Gly Ala Lys Val Tyr Gly Glu Ile Ser
245 250 255

Gly Tyr Gly Ala Thr Phe Asp Pro Pro Pro Gly Ser Gly Ser Gly
260 265 270

Ser Thr Leu Arg Thr Ala Ile Arg Val Ala Leu Asp Asp Ala Gly
275 280 285

Val Ala Pro Gly Asp Val Asp Ala Val Phe Ala Asp Gly Ala Gly
290 295 300

Val Pro Glu Leu Asp Arg Ala Glu Ala Glu Ala Ile Thr Asp Val
305 310 315

Phe Gly Ser Gly Gly Val Pro Val Thr Val Pro Lys Thr Met Thr
320 325 330

Gly Arg Leu Tyr Ser Gly Ala Ala Pro Leu Asp Val Ala Cys Ala
335 340 345

Leu Leu Ala Met Gln Ala Gly Val Ile Pro Pro Thr Val His Ile
350 355 360

Asp Pro Cys Pro Glu Tyr Gly Leu Asp Leu Val Leu His Gln Ala
365 370 375

Arg Pro Ala Thr Val Arg Thr Ala Leu Val Leu Ala Arg Gly His
380 385 390

Gly Gly Phe Asn Ser Ala Met Ala Val Arg Ala Gly Arg
395 400



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Ser Ala Arg Phe Leu Val Thr Gly Ile Gly Val Ala Ala Pro
5 10 15

Ser Gly Leu Gly Val Glu Asp Phe Trp Ser Val Thr Arg Ile Gly
20 25 30

Lys Asn Ala Ile Gly Pro Val Thr Arg Phe Asp Ala Ser Ala Tyr
35 40 45

Pro Ser Arg Leu Ala Gly Glu Ile His Gly Phe Glu Pro Lys Glu
50 55 60

His Leu Pro Gly Arg Leu Val Pro Gln Thr Asp Arg Val Thr Gln
65 70 75

Leu Ala Leu Val Ala Ala Asp Cys Ala Phe Ala Asp Ala Gly Ile
80 85 90

Glu Pro Gly Thr Ile Asp Pro Tyr Ala Met Gly Val Val Thr Ala
95 100 105

Ala Gly Ala Gly Gly Phe Glu Phe Ala Glu Asn Glu Leu Arg Lys
110 115 120

Leu Trp Ser Glu Gly Ala Lys Arg Val Ser Ala Tyr Gln Ser Phe
125 130 135

Ala Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg Asn
140 145 150

Gly Leu Arg Gly Pro Ala Gly Val Val Ile Ser Asp Gln Ala Gly
155 160 165

Gly Leu Asp Ala Leu Ala Gln Ala Arg Arg Gln Leu Arg Lys Gly
170 175 180

Ser Lys Leu Ile Ala Thr Gly Gly Phe Asp Ala Pro Ile Cys Ser
185 190 195

Leu Gly Trp Ala Ser Gln Pro Arg Thr Gly Gly Leu Met Phe His
200 205 210

Glu Arg Thr Glu Pro Glu Arg Ala Tyr Leu Pro Phe Glu Asp Ala
215 220 225

Ala Ala Gly Tyr Val Pro Gly Glu Gly Gly Ala Met Leu Ile Leu
230 235 240

Glu Asp Glu Asp Ser Ala Arg Asp Arg Gly Ala Arg Thr Val Tyr
245 250 255

Gly Glu Phe Ala Gly Tyr Gly Ala Thr Leu Asp Pro Lys Pro Gly
260 265 270

Ser Gly Arg Glu Pro Gly Leu Arg Arg Ala Ile Asp Val Ala Leu
275 280 285

Thr Asp Ala Ala Cys His Pro Ala Glu Val Glu Val Val Phe Ala
290 295 300

Asp Gly Ala Ala Thr Pro Arg Leu Asp Arg Glu Glu Ala Glu Ala
305 310 315

Ile Thr Ala Val Phe Gly Pro Arg Ala Val Pro Val Thr Val Pro
320 325 330

Lys Thr Met Thr Gly Arg Ile Asn Ser Gly Gly Ala Pro Ile Asp
335 340 345

Val Val Ser Ala Val Leu Ser Met Arg Glu Gly Leu Ile Pro Pro
350 355 360

Thr Thr Asn Val Glu Leu Ser Asp Ala Tyr Asp Leu Asp Leu Val 
365 370 375 

Ala Val Arg Pro Arg Thr Ala Ser Val Arg Thr Ala Leu Val Leu 
380 385 390 

Ala Arg Gly Arg Gly Gly Phe Asn Ser Ala Val Val Val Arg Ala 
395 400 405 

Val Asp 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGATCTGCTT GAGGTAGTCT ACGAGGCACT GGAGTCAGCA GGGTACTTTG 50 
GCGCCAAGTC AAACCCGGAA CCTGATGACT ATGGATGCTA TATCGGTGCA 100 
GTGATGAACA ACTACTATGA CAACGTTTCT TGCCATCCAC CCACCGCATA 150 
CGCTACTCTT GGAACGTCGC GTTGCTTCCT TAGTGGCTGC ATGAGCCATT 200
ACTTTGGATG GACGGGACCT TCCTTGACCA TTGATACGGC TTGCTCGTCA 250
TCACTAGTTG CTATAAACAC CGCTTGTAGA GCAATATGGT CTGGTGAGTG 300
CTCCCGGGCC ATAGCTGGGG GTACCAATGT CTTCACAAGT CCGTTTGACT 350
ACCAGAATCT TCGCGCCGCA GGATTCCTCA GCCCTAGCGG GCAATGCAAG 400
CCGTTTGATG CTTCTGCTGA TGGCTACTGC CGTGGAGAAG GAGTTGGTGT 450 
CGTTGTGCTT AAGCCTTTGA CGGCTGCTAT GCAAGAGAAC GATAACATCC 500 
TTGGCGTCAT TGTGGGGTCT GCAGCAAACC AAAACCAAAA CCTCAGTCAT 550 
ATCACGGTGC CCCATTCGGG CTCACAAGTC CAGCTTTATC GAAAGGTGAT 600 
GAAGCTTGCA GGTATAGAGC CAGAGTCAGT CTCCTACGTT GAG 643 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr
5 10 15

Phe Ala Asp Ser Ser Arg Pro Glu Asp Val Gly Cys Tyr Ile Gly
20 25 30

Ala Cys Ala Thr Asp Tyr Asp Phe Asn Val Ala Ser His Pro Pro
35 40 45

Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly
50 55 60

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu
65 70 75

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys
80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly
95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Tyr Glu Asn Phe Ser Ala
110 115 120

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala
125 130 135

Xaa Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val
140 145 150

Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asp Asp His Ile Ile
155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Asp Asn Cys Val
170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu
185 190 195

Arg Val Thr Arg Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe
200 205 210

Val Glu



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
AATCCTCATG GAATCAGCTT GGCAAACACT AGAAAACGCT GGCATAACTG 50 
CGAACAAAGT AGCTGGCAGC AGTACAGGAG TTTTTGTGGG TGCTAGTGGC 100 
TCTGATTACT GTTGGGTAAT GGAGCGGGTA GGTATTCCCA TAGAAGCTCA 150 
CGTTGCAACG GGCACGTCGT TGGCAGCGCT GGCAAATCGC ATCTCTTACT 200 
TTTTTGACTT GCGAGGCCCA AGCATCGTCA TTGATACGGC GTGTTCTAGT 250
TCGTTGATGG CAGTGCATCA GGCGGTTCAA TCTATCCGAG CAGGTGAGTG 300
CTTACAAGCA CTGGTGGGCG GTATACATAT CATGAGCCAT CCGGCTAACA 350
GTATTGCATA TTACAAGGCT GGGATGTTGG CGCATGATGG CAAGTGCAAG 400
ACATTTGACG ATCGCGCAGA TGGGTACGTT CGCAGTGAAG GCGCTGTGAT 450
GCTTCTGCTC AAGCAATTGC ATCAGGCGGA AGCAGATGGC GATCTAATTT 500
ATGCGACAAT CAAGGGGTCA GCCTCGAATC ATGGTGGACA GTCCGCCGGC 550
CTCACCGTAC CGAATCCGCA ACAGCAGGCA GCACTCTTAA CCAATGCCTG 600
GAAAGCCTCT GGTGTAGACC CTAACACGAT TAGTTTTATC GAA 643



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Ile Leu Met Glu Ser Ala Trp Gln Thr Leu Glu Asn Ala Gly Ile
5 10 15

Thr Ala Asn Lys Val Ala Gly Ser Ser Thr Gly Val Phe Val Gly
20 25 30

Ala Ser Gly Ser Asp Tyr Cys Trp Val Met Glu Arg Val Gly Ile
35 40 45

Pro Ile Glu Ala His Val Ala Thr Gly Thr Ser Leu Ala Ala Leu
50 55 60

Ala Asn Arg Ile Ser Tyr Phe Phe Asp Leu Arg Gly Pro Ser Ile
65 70 75

Val Ile Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Val His Gln
80 85 90

Ala Val Gln Ser Ile Arg Ala Gly Glu Cys Leu Gln Ala Leu Val
95 100 105

Gly Gly Ile His Ile Met Ser His Pro Ala Asn Ser Ile Ala Tyr
110 115 120

Tyr Lys Ala Gly Met Leu Ala His Asp Gly Lys Cys Lys Thr Phe
125 130 135

Asp Asp Arg Ala Asp Gly Tyr Val Arg Ser Glu Gly Ala Val Met
140 145 150

Leu Leu Leu Lys Gln Leu His Gln Ala Glu Ala Asp Gly Asp Leu
155 160 165

Ile Tyr Ala Thr Ile Lys Gly Ser Ala Ser Asn His Gly Gly Gln
170 175 180

Ser Ala Gly Leu Thr Val Pro Asn Pro Gln Gln Gln Ala Ala Leu
185 190 195

Leu Thr Asn Ala Trp Lys Ala Ser Gly Val Asp Pro Asn Thr Ile
200 205 210

Ser Phe Ile Glu



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
TATATTACTC CAGGTTGCTT ACGAAGCATT GGAAATGTCC GGGTATTTCG 50 
CCGACTCGTC CAAGCCTGAG GACGTAGGTT GCTATATTGG AGCTTGTGCA 100 
ACAGATTACG ATTTCAGCGT AGCGTCCCAT CCTCCTACGG CATACTCAGC 150 
AACTGGCACG CTCCGATCTT TCCTGAGTGG CAAGCTGTCA CATTACTTTG 200
GTTGGTCTGG TCCCTCTCTT GTCCTGGACA CCGCCTGCTC TTCATCGGCG 250
GTGGCCATTC ACACTGCATG TACTGCTTTG AGGACTGGCC AGTGTTCTCA 300
GGCTTTAGCA GGCGGGATTA CTTTGATGAC CAGCCCGTAT CTCTTTGAGA 350
ACTTTGCTGC CGCCCATTTC TTGAGCCCAA CGGGAGGCTC AAAGCCGTTC 400 
AGTGCAGATG CAGATGGGTA TTGTAGAGGA GAAGGGGGTG GGCTCGTGGT 450 
CTTGAAACGA CTTTCAGATG CTATCAGGGA TAACGACCAC ATCATTAGCG 500 
TCATCGCTGG CTCAGCCGTC AACCAGAACG CTAACTGTGT GCCTATCACC 550 
GTCCCTCATA CTTCGTCTCA GGGCAATCTC TATGAACGAG TTACCGCACA 600 
GGCAGGGGTG ACACCTAATA AGGTCACTTT TGTGGAA 637 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr
5 10 15

Phe Ala Asp Ser Ser Lys Pro Glu Asp Val Gly Cys Tyr Ile Gly
20 25 30

Ala Cys Ala Thr Asp Tyr Asp Phe Ser Val Ala Ser His Pro Pro
35 40 45

Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly
50 55 60

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu
65 70 75

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys
80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly
95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Phe Glu Asn Phe Ala Ala
110 115 120

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala
125 130 135

Asp Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val
140 145 150

Leu Lys Arg Leu Ser Asp Ala Ile Arg Asp Asn Asp His Ile Ile
155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Ala Asn Cys Val
170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu
185 190 195

Arg Val Thr Ala Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe
200 205 210

Val Glu

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 691

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CCATCTGCTA GAAATCAGCT ACGAGGCGCT CGAGAATGCA GGCTTTCCAC 50
TGCCTAGCAT TGCTGGCACG AACATGGGTG TCTTTGTCGG CGGAAGCAAC 100
TCTGAGTATC GAGCGCACAT CGGAAACGAT ACCGACAACT TACCGATGTT 150
TGAAGCAACA GGCAATGCAG AATCTCTGCT GGCGAATCGA GTCTCTTATG 200
TGTATGATCT CCACGGCGCA AGTCTGACGA TTGGTACCGC TTGTTCCGTC 250
GAGTTTAGCA GCTTTGGATA GCGCGTTTCT CAGCTTGCAG CTGGTAAGTC 300
GTCCACAGCA ATTGTTGCCG GCTCCGTTGT TCGAATCGTA CCGTCATCGA 350
CCATCTCACC TTCTACTATG AAGTAAGCAG TCATGGCTCT TGACACGGAG 400
ACTACTCACC ATTCCAGGCT TCTGTCACCA GAAGGGCGGT GTTATGCGTT 450
CGATGACAGA GCCACTAGTG GTTTTGGAAG GGGTGAAGGT TCTGCCTGCA 500
TAATATTGGA AACCTTAGAG GCAGCCTTAA GAGACAACGA CCCAATCCGA 550
TCGGTCATTC GCAATTCGGG AGTCAATCAA GATGGTAAAA CTGCAGGTAT 600
CACAATGCCA AATGGGGAAG CGCAAGCTTC ATTGATACAA TCTGTTTATC 650
GCACTGCTGG ATTGGACCCT CTGCAGACAG ATTACGTCGA G 691

WO 98/53097 PCT/CA98/00488



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 215 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

His Leu Leu Glu Ile Ser Tyr Glu Ala Leu Glu Asn Ala Gly Phe
5 10 15

Pro Leu Pro Ser Ile Ala Gly Thr Asn Met Gly Val Phe Val Gly
20 25 30

Gly Ser Asn Ser Glu Tyr Arg Ala His Ile Gly Asn Asp Thr Asp
35 40 45

Asn Leu Pro Met Phe Glu Ala Thr Gly Asn Ala Glu Ser Leu Leu
50 55 60

Ala Asn Arg Val Ser Tyr Val Tyr Asp Leu His Gly Ala Ser Leu
65 70 75

Thr Ile Gly Thr Ala Cys Ser Val Glu Phe Ser Ser Phe Gly Xaa
80 85 90

Arg Val Ser Gln Leu Ala Ala Gly Lys Ser Ser Thr Ala Ile Val
95 100 105

Ala Gly Ser Val Val Arg Ile Val Pro Ser Ser Thr Ile Ser Pro
110 115 120

Ser Thr Met Lys Leu Leu Ser Pro Glu Gly Arg Cys Tyr Ala Phe
125 130 135

Asp Asp Arg Ala Thr Ser Gly Phe Gly Arg Gly Glu Gly Ser Ala
140 145 150

Cys Ile Ile Leu Glu Thr Leu Glu Ala Ala Leu Arg Asp Asn Asp
155 160 165

Pro Ile Arg Ser Val Ile Arg Asn Ser Gly Val Asn Gln Asp Gly
170 175 180

Lys Thr Ala Gly Ile Thr Met Pro Asn Gly Glu Ala Gln Ala Ser
185 190 195

Leu Ile Gln Ser Val Tyr Arg Thr Ala Gly Leu Asp Pro Leu Gln
200 205 210

Thr Asp Tyr Val Glu
215



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 680

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
AACTGTTAGA GGTCAGTTAC GAGGCGTTTG AGAATGCGGG CATATCATTA 50 
TCGAGTGTTG CAGGTACCGA CGTTGGGGTA TTCATCAGTG CCAGCACCAA 100 
TGATTACCGT TTCGTTTTCC ACAACGACCT CGACACATTG CCAATGTTTG 150
AATCCACTGG GAGTGAATTA TCGATCATGT CCAATCGTAT CTCCTATACT 200
TTCAATCTTA GAGGTCCAAG TATGACGATT GATACTCCCT GTTCCTCAAG 250
TTTGATCGCA CTCCATACAG CATTCAGAAG TCTACAGGTC GGAGAAAGCT 300
CTTGCGCCAT TGTCGGTGGA TCTAACCTCC ACATCACTCC AGATTCCTAC 350 
ATTTCATTCT CGACGATGAG GTAAGCACTA TCGTTTGCGA ATTACCTATC 400 
TTTGATTACG AGTGACTAAG TTGTACAGGC TCCTGTCGCC CCATGGACGA 450 
TCGTGCAGTC AATGGGTTTG GGCGCGGAGA GGGCACAAGT TGCATAATAC 500 
TGAAGCCTTT AGATGCCGCA TTGAAAGACC ACGATCCCAT AAGGGCAGTT 550 
ATTCGCAATA CGGGCACTAA TCAAGATGGG AAGACGACAG GTATCACGAT 600 
GCCGAATGGT GAAGCACAGG CCGCCTTAAT GCAATCAGTC TACGAGGCAG 650 
CGGGCTTAGA TCCCCTTGAA ACAGACTATG 680 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Leu Leu Glu Val Ser Tyr Glu Ala Phe Glu Asn Ala Gly Ile Ser
5 10 15

Leu Ser Ser Val Ala Gly Thr Asp Val Gly Val Phe Ile Ser Ala
20 25 30

Ser Thr Asn Asp Tyr Arg Phe Val Phe His Asn Asp Leu Asp Thr
35 40 45

Leu Pro Met Phe Glu Ser Thr Gly Ser Glu Leu Ser Ile Met Ser
50 55 60

Asn Arg Ile Ser Tyr Thr Phe Asn Leu Arg Gly Pro Ser Met Thr
65 70 75

Ile Asp Thr Pro Cys Ser Ser Ser Leu Ile Ala Leu His Thr Ala
80 85 90

Phe Arg Ser Leu Gln Val Gly Glu Ser Ser Cys Ala Ile Val Gly
95 100 105

Gly Ser Asn Leu His Ile Thr Pro Asp Ser Tyr Ile Ser Phe Ser
110 115 120

Thr Met Ser Cys Thr Gly Ser Cys Arg Pro Met Asp Asp Arg Ala
125 130 135

Val Asn Gly Phe Gly Arg Gly Glu Gly Thr Ser Cys Ile Ile Leu
140 145 150

Lys Pro Leu Asp Ala Ala Leu Lys Asp His Asp Pro Ile Arg Ala
155 160 165

Val Ile Arg Asn Thr Gly Thr Asn Gln Asp Gly Lys Thr Thr Gly
170 175 180

Ile Thr Met Pro Asn Gly Glu Ala Gln Ala Ala Leu Met Gln Ser
185 190 195

Val Tyr Glu Ala Ala Gly Leu Asp Pro Leu Glu Thr Asp Tyr
200 205

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 691 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GCATTTGCTG GAGGTGAGCT ATGAAGCGCT TGAAAATGCT GGCCTTTCTC 50 

TTCCTTGCAT TGCCGGCACC AAAATGGGAG TCTTCGTTGG TGGAGGCAAT 100 

GCAKAGTATC GATCGCATAT CGGCCAAGAT ATTGACAATC TGCCTATGTT 150 

CGAGGCAACT GGTAACGCAG AGGCGCTATT GGCGAATAGA GTTTCTTATG 200 

TATATGATCT TCGAGGACCG AGTCTAACCA CCGATACCGC CTGTTCCTCA 250 

AGTCTCGCCG CTTTGAACAC GGCATTCTTA AGTCTACAGG CTGGCGAGTC 300 

GTCTACAGCA CTGGTCGGTA GCTCAGTAAT TCGGCTTAGG CCTGAGTCAG 350 

CCATCTCACT TTCCAGCATG CAGTAAGTCC TTCATGGTGC ACCTGCATAC 400 

ATTGCTAATA AGTGCAGGCT TCTATCCCCA GATGGAAAAT CTTACGCGTT 450

CGATGAGAGA GCTACCAGTG GTTTTGGAAG GGGTGAGGGT TCGGGTTGCA 500 

TAATACTAAA ACCCCTGGAC GCAGCCGTGA GAGACGGAGA CCCAATTAGA 550 

GCAGTCATTT GTAACTCGGG TGTAAACCAA GACGGCAAGA CTGCTGGTAT 600 

TACAATGCCT AATGGACACG CGCAAGCTTC TCTAATACGG TCTGTTTATC 650 

AGTCTACAGG GATAGACCCT TTAATGACGG ACTATGTCGA A 691 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 215

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

His Leu Leu Glu Val Ser Tyr Glu Ala Leu Glu Asn Ala Gly Leu
5 10 15

Ser Leu Pro Cys Ile Ala Gly Thr Lys Met Gly Val Phe Val Gly
20 25 30

Gly Gly Asn Ala Xaa Tyr Arg Ser His Ile Gly Gln Asp Ile Asp
35 40 45

Asn Leu Pro Met Phe Glu Ala Thr Gly Asn Ala Glu Ala Leu Leu
50 55 60

Ala Asn Arg Val Ser Tyr Val Tyr Asp Leu Arg Gly Pro Ser Leu
65 70 75

Thr Thr Asp Thr Ala Cys Ser Ser Ser Leu Ala Ala Leu Asn Thr
80 85 90

Ala Phe Leu Ser Leu Gln Ala Gly Glu Ser Ser Thr Ala Leu Val
95 100 105

Gly Ser Ser Val Ile Arg Leu Arg Pro Glu Ser Ala Ile Ser Leu
110 115 120

Ser Ser Met Gln Leu Leu Ser Pro Asp Gly Lys Ser Tyr Ala Phe
125 130 135

Asp Glu Arg Ala Thr Ser Gly Phe Gly Arg Gly Glu Gly Ser Gly
140 145 150

Cys Ile Ile Leu Lys Pro Leu Asp Ala Ala Val Arg Asp Gly Asp
155 160 165

Pro Ile Arg Ala Val Ile Cys Asn Ser Gly Val Asn Gln Asp Gly
170 175 180

Lys Thr Ala Gly Ile Thr Met Pro Asn Gly His Ala Gln Ala Ser
185 190 195

Leu Ile Arg Ser Val Tyr Gln Ser Thr Gly Ile Asp Pro Leu Met
200 205 210

Thr Asp Tyr Val Glu
215

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GCTGTTTCTT CAAACTAGCT GGCAATGCAT TGAAGATGCG GGATATAACC 50 
CCACATCCTT TGCAGGTAGC AAGTGTGGCG TATTTGTCGG CTGCGAAACG 100 
GGAGACTATG GAAAGATTGT GCAGCGATAT GAATTGAGCG CTCTCGGATT 150 
GCTAGGCTCT TCTGCGGCAC TGCTCCCGGC AAGGATCTCC TATTTCCTCA 200
ACCTCCAGGG CCCTTGTATG GCGATCGACA CAGCCTGCTC TGCATCCCTA 250 
GTTGCCATAG CCAACGCCTG CGACAGCCTG GTACTGGGTC ACTCCGATGC 300 
AGCCTTGGCC GGAGGAGTCT ACGTCCTCTC CGGGCCGGAA ATGCACATTA 350 
TGATGAGCAA AGCTGGTATC TTGTCACCCG ATGGCAGATG TTTCACCTTC 400 
GATCGACGTG CTAACGGCTT TGTACCGGGC GAAGGTGTGG GCGTCGTGTT 450 
ACTCAAAGGC CTTGCCGATG CCGAAAAAGA CGGTGATAAT ATCTGTGGTG 500
TGATTCGAGG CTGGGGGGTG AATCAAGACG GCAAGACCAG TGGAATTACA 550 
GCACCTAACG GACAGTCACA GCAACGATTG CAGAAAGAAG TCTACGAACG 600 
GTTTCAGATT CAGCCAGCAG ACATTCAACT GGTTGAG 637 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
Leu Phe Leu Gln Thr Ser Trp Gln Cys Ile Glu Asp Ala Gly Tyr
5 10 15

Asn Pro Thr Ser Phe Ala Gly Ser Lys Cys Gly Val Phe Val Gly
20 25 30

Cys Glu Thr Gly Asp Tyr Gly Lys Ile Val Gln Arg Tyr Glu Leu
35 40 45

Ser Ala Leu Gly Leu Leu Gly Ser Ser Ala Ala Leu Leu Pro Ala
50 55 60

Arg Ile Ser Tyr Phe Leu Asn Leu Gln Gly Pro Cys Met Ala Ile
65 70 75

Asp Thr Ala Cys Ser Ala Ser Leu Val Ala Ile Ala Asn Ala Cys
80 85 90

Asp Ser Leu Val Leu Gly His Ser Asp Ala Ala Leu Ala Gly Gly
95 100 105

Val Tyr Val Leu Ser Gly Pro Glu Met His Ile Met Met Ser Lys
110 115 120

Ala Gly Ile Leu Ser Pro Asp Gly Arg Cys Phe Thr Phe Asp Arg
125 130 135

Arg Ala Asn Gly Phe Val Pro Gly Glu Gly Val Gly Val Val Leu
140 145 150

Leu Lys Arg Leu Ala Asp Ala Glu Lys Asp Gly Asp Asn Ile Cys
155 160 165

Gly Val Ile Arg Gly Trp Gly Val Asn Gln Asp Gly Lys Thr Ser
170 175 180

Gly Ile Thr Ala Pro Asn Gly Gln Ser Gln Gln Arg Leu Gln Lys
185 190 195

Glu Val Tyr Glu Arg Phe Gln Ile Gln Pro Ala Asp Ile Gln Leu
200 205 210

Val Glu

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GATGATGATA GAAGTCGCTT ACCAAGGACT TGAGAGTGCA GGGCTGTCTC 50 
TTCAGGATGT TGCCGGATCG AGGACTGGAG TCTTCATTGG CCATTTCAGC 100 
AGTGATTACC GAGACATGAT ATTCAGAGAT CCCGAGAGGG CACCGACCTA 150 
CACTTTCAGT GGGGTTAGTA AGACGTCATT GGCGAATCGC ATCTCCTGGC 200
TGTTCGACCT GAAAGGCCCA AGTTTCAGCT TGGACACAGC CTGCTCGTCG 250
AGTCTGGTCG CCCTGCATTT GGCTTGCCAA AGCTTACGCG CTGGAGAGTC 300
AGATATCGCC ATTGTCGGAG GGGTCAACCT TCTCTGGAAT CCGGAGTTGT 350
TCATGTATCT CTCCAATCAG CACTTTCTCT CGCCAGATGG GAAATGTAAA 400
AGCTTTGACG AATCCGGCGA TGGCTATGGT CGTGGCGAAG GCATTGCCGC 450
TCTTGTACTA AGAAGAGTCG ACGACGCGAT TGCGGCCCGG GACCCTATTC 500 
GTGCCATCAT TCGCGGTACT GGGAGTAATC AGGACGGACA CACCAAAGGC 550 
TTCACCCTCC CCAGCGCAGA AGCCCAGGCG AGGTTGATTA GAGATACGTA 600 
CTCTGCCGCG GGGCTAGGTT TTAGAGACAC GCGATACGTA GAA 643 

(2) INFORMATION FOR SEQ ID NO:44: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Met Ile Glu Val Ala Tyr Gln Gly Leu Glu Ser Ala Gly Leu
5 10 15

Ser Leu Gln Asp Val Ala Gly Ser Arg Thr Gly Val Phe Ile Gly
20 25 30

His Phe Ser Ser Asp Tyr Arg Asp Met Ile Phe Arg Asp Pro Glu
35 40 45

Arg Ala Pro Thr Tyr Thr Phe Ser Gly Val Ser Lys Thr Ser Leu
50 55 60

Ala Asn Arg Ile Ser Trp Leu Phe Asp Leu Lys Gly Pro Ser Phe
65 70 75

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu
80 85 90

Ala Cys Gln Ser Leu Arg Ala Gly Glu Ser Asp Ile Ala Ile Val
95 100 105

Gly Gly Val Asn Leu Leu Trp Asn Pro Glu Leu Phe Met Tyr Leu
110 115 120

Ser Asn Gln His Phe Leu Ser Pro Asp Gly Lys Cys Lys Ser Phe
125 130 135

Asp Glu Ser Gly Asp Gly Tyr Gly Arg Gly Glu Gly Ile Ala Ala
140 145 150

Leu Val Leu Arg Arg Val Asp Asp Ala Ile Ala Ala Arg Asp Pro
155 160 165

Ile Arg Ala Ile Ile Arg Gly Thr Gly Ser Asn Gln Asp Gly His
170 175 180

Thr Lys Gly Phe Thr Leu Pro Ser Ala Glu Ala Gln Ala Arg Leu
185 190 195

Ile Arg Asp Thr Tyr Ser Ala Ala Gly Leu Gly Phe Arg Asp Thr
200 205 210

Arg Tyr Val Glu

(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
RGTCCTTATG GAGACCGTCT ACGAGGCAAT TGAGTCTGCG GGTATGACTT 50 
TGAAGGGGCT GCAAGGCAGC GACACAAGTG TGTATGCCGG CGTCATGTGT 100 
GGCGACTACG AGGCCATACA GCTCCGCGAT CTGGACGCGG CCCCGACTTA 150 
TTTCGCAGTG GGAACCTCGC GAGCTATCCT CTCCAATCGA ATCTCGTATT 200 
TCTTCAACTG GCACGGCGCG TCCATCACCA TGGACACGGC ATGTTCCTCT 250 
AGTCTGGTCG CCATTCACTT GGCCGTTCAG RCGCTTCGGG CAAATGAATC 300 
ACGRATGGCC GTGGCGTGTG GGTCGAACCT CATTCTCGGA CCCGAGAGTT 350 
ACATTATTGA AAGCAAGGTG AAGATGCTGT CCCCGGACGG TCTCAGCCGA 400 
ATGTGGGATA AAGACGCCAA CGGCTATGCG CGTGGAGATG GCGTTGCGGC 450
CGTTGTTTTG AAGACTCTCA GCGCCGCGCT GGCGGACGGA GACCACATTG 500 
AATGTCTCAT ACGGGAGACG GGACTCAACC AGGACGGTGC GACAGCCGGT 550 
CTCACCATGC CTAGCGCCAC TGCGCAGCGA GCTCTTATTC ACAGTACGTA 600 
CACCAAGGCA GGTCTTGATC TCACTGCCCA GGCAGACCGT CCCCAGTATT 650 
TCGAG 655

(2) INFORMATION FOR SEQ ID NO:46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 
Val Leu Met Glu Thr Val Tyr Glu Ala Ile Glu Ser Ala Gly Met
5 10 15

Thr Leu Lys Gly Leu Gln Gly Ser Asp Thr Ser Val Tyr Ala Gly
20 25 30

Val Met Cys Gly Asp Tyr Glu Ala Ile Gln Leu Arg Asp Leu Asp
35 40 45

Ala Ala Pro Thr Tyr Phe Ala Val Gly Thr Ser Arg Ala Ile Leu
50 55 60

Ser Asn Arg Ile Ser Tyr Phe Phe Asn Trp His Gly Ala Ser Ile
65 70 75

Thr Met Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Ile His Leu
80 85 90

Ala Val Gln Xaa Leu Arg Ala Asn Glu Ser Arg Met Ala Val Ala
95 100 105

Cys Gly Ser Asn Leu Ile Leu Gly Pro Glu Ser Tyr Ile Ile Glu
110 115 120

Ser Lys Val Lys Met Leu Ser Pro Asp Gly Leu Ser Arg Met Trp
125 130 135

Asp Lys Asp Ala Asn Gly Tyr Ala Arg Gly Asp Gly Val Ala Ala
140 145 150

Val Val Leu Lys Thr Leu Ser Ala Ala Leu Ala Asp Gly Asp His
155 160 165

Ile Glu Cys Leu Ile Arg Glu Thr Gly Leu Asn Gln Asp Gly Ala
170 175 180

Thr Ala Gly Leu Thr Met Pro Ser Ala Thr Ala Gln Arg Ala Leu
185 190 195

Ile His Ser Thr Tyr Thr Lys Ala Gly Leu Asp Leu Thr Ala Gln
200 205 210

Ala Asp Arg Pro Gln Tyr Phe Glu
215



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 754 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 
AGGTCTGTTG GAGACGGTTT ATCGCGCCTT TGAAAACGGT AAGGCCACCC 50 
TGGGAATAAA CCGGCTTCTC GTCCTGACGG CTTACTCTAT GCTAGCTGGT 100
ATACCCATGG AGCAGGTCCT CGGGTCGAAG ACATCCGTTT ACGTGGGATG 150 
TTTCACCCGC GAGTTCGAGC AGTTGCTCGC GAGGGACCCC GAGATGAATC 200 
TGAAATACAT CGCTACGGGC ACCGGCACGG CGATGCTGTC GAATCGCCTC 250 
TCCTGGTTCT ATGACTTGAA AGGCGCCAGT ATCACTCTTG ATACTGCCTG 300 
TTCGTCCAGT CTCAATGCGT GCCATCTTGC TTGCGCAAGC TTACGTAATG 350 
GAGAAGCCAA TATGGTAAGA CTCCAACTCA TCGCGGGACT GAACAATTGC 400
ATACTGATCC ATCAAAGGCC CTGGTAGGAG GCTGCAATCT TTTCTATAAC 450 
CCGGAAACGA TCATCCCTCT GACAAATCTA GGCTTTCTTT CTCCGGATAA 500 
CAAATGTTAT AGTTTTGACC ATCGTGCTAA CGGTTACTCT CGCGGCGAGG 550 
GGTTTGGTAT TCTTGTATTG AAGAGACTGT CGGACGCTCT ACGCGATAAC 600 
GACACTGTCC GTGCAGTGAT TCGGGCCTCT TCGTCTAACC AGGATGGCAA 650 
GTCTCCCGGT ATCACACAGC CTACCAAACA AGCGCAAATA CAACTGATCA 700 
AAGACACTTA CGCGGCTGCC GGGCTGGACT ATACGCAAAC CCGCTACTTC 750 
GANA 754 



(2) INFORMATION FOR SEQ ID NO: 48: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
Gly Leu Leu Glu Thr Val Tyr Arg Ala Phe Glu Asn Ala Gly Ile
5 10 15

Pro Met Glu Gln Val Leu Gly Ser Lys Thr Ser Val Tyr Val Gly
20 25 30

Cys Phe Thr Arg Glu Phe Glu Gln Leu Leu Ala Arg Asp Pro Glu
35 40 45

Met Asn Leu Lys Tyr Ile Ala Thr Gly Thr Gly Thr Ala Met Leu
50 55 60

Ser Asn Arg Leu Ser Trp Phe Tyr Asp Leu Lys Gly Ala Ser Ile
65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Asn Ala Cys His Leu
80 85 90

Ala Cys Ala Ser Leu Arg Asn Gly Glu Ala Asn Met Ala Leu Val
95 100 105

Gly Gly Cys Asn Leu Phe Tyr Asn Pro Glu Thr Ile Ile Pro Leu
110 115 120

Thr Asn Leu Gly Phe Leu Ser Pro Asp Asn Lys Cys Tyr Ser Phe
125 130 135

Asp His Arg Ala Asn Gly Tyr Ser Arg Gly Glu Gly Phe Gly Ile
140 145 150

Leu Val Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asn Asp Thr
155 160 165

Val Arg Ala Val Ile Arg Ala Ser Ser Ser Asn Gln Asp Gly Lys
170 175 180

Ser Pro Gly Ile Thr Gln Pro Thr Lys Gln Ala Gln Ile Gln Leu
185 190 195

Ile Lys Asp Thr Tyr Ala Ala Ala Gly Leu Asp Tyr Thr Gln Thr
200 205 210

Arg Tyr Phe Xaa



(2) INFORMATION FOR SEQ ID NO: 49: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 722 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

CTTGTTACTC GAGACTGTCT ACGAATCTCT CGAGTCGGCT GGTCAGACAA 50 

TCGAAGGCTT GCAAGGATCG CAAACCGCAG TGTATATTGG TGTTATGTGC 100

GATGATTACG CCGAGCTCGT GTATCATGAT ACAGAGTCAA TCCCGACCTA 150 

TGCTGCAACT GGTAGTGCAC GCAGCATGAT GTCGAACCGA ATCTCTTACT 200 

TCTTTGACTG GAAGGGGCCG TCAATGACCA TTGATACTGC CTGTTCCTCT 250 

AGTCTTGTCG CTGTCCACCA GGCCGTTCAA GTTCTCAGGA GCGGAGAATC 300 

CCGCGTCGCA GTGGCTGCTG GGGCAAATCT CATCTTCGGA CCCAGTAAGT 350 

CTTCCTAAAA TATGAGTAGG CTCCAGTCAT TGTGATTGCT AATCACTTCA 400 

ACCATTTACA GAGATGTACA TTGCTGAGAG CAACCTCAAT ATGTTGTCCC 450 

CAACTGGSCG STCCCGAATG TGGGACGCTA ACSCGGATGG CTATGCACGA 500 

GGAGAGGGTA TTGCATCTGT CGTACTCAAA ACTCTTAGCT CTGCTATAGC 550 

AGATGGTGAT ACCATCGAAT GTTTGATCCG AGAAACCGGT GTCAACCAGG 600 

ATGGCCGCAC CACTGGTATC ACTATGCCAA GCTCCGCAGC CCAAGCCAGT 650 

TTGATCCGTC AGACTTACGC CAGAGCTGGT TTGGACCTGG CGAAGCAAGC 700 

TGATCGGCCT CAATTCTTTG AG 722 



(2) INFORMATION FOR SEQ ID NO:50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
Leu Leu Leu Glu Thr Val Tyr Glu Ser Leu Glu Ser Ala Gly Gln
5 10 15

Thr Ile Glu Gly Leu Gln Gly Ser Gln Thr Ala Val Tyr Ile Gly
20 25 30

Val Met Cys Asp Asp Tyr Ala Glu Leu Val Tyr His Asp Thr Glu
35 40 45

Ser Ile Pro Thr Tyr Ala Ala Thr Gly Ser Ala Arg Ser Met Met
50 55 60

Ser Asn Arg Ile Ser Tyr Phe Phe Asp Trp Lys Gly Pro Ser Met
65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln
80 85 90

Ala Val Gln Val Leu Arg Ser Gly Glu Ser Arg Val Ala Val Ala
95 100 105

Ala Gly Ala Asn Leu Ile Phe Gly Pro Lys Met Tyr Ile Ala Glu
110 115 120

Ser Asn Leu Asn Met Leu Ser Pro Thr Gly Arg Ser Arg Met Trp
125 130 135

Asp Ala Asn Xaa Asp Gly Tyr Ala Arg Gly Glu Gly Ile Ala Ser
140 145 150

Val Val Leu Lys Thr Leu Ser Ser Ala Ile Ala Asp Gly Asp Thr
155 160 165

Ile Glu Cys Leu Ile Arg Glu Thr Gly Val Asn Gln Asp Gly Arg
170 175 180

Thr Thr Gly Ile Thr Met Pro Ser Ser Ala Ala Gln Ala Ser Leu
185 190 195

Ile Arg Gln Thr Tyr Ala Arg Ala Gly Leu Asp Leu Ala Lys Gln
200 205 210

Ala Asp Arg Pro Gln Phe Phe Glu
215



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 703 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
AATATTACTT GAGACGATCT ACGAAGGACT TGAGTCCGCC GGACTTACCA 50 
TAAAGGGGCT GCAAGGTTCC CAAACAGCTG TGTACGTCGG TCTCATGGCT 100 
GGAGACTACT ATGACATCCA GATGCGCGAC ATAGAGACTT TGCCTCGATA 150 
TGCTGCTACC GGGACTGCTC GTAGCATTAT GAGCAACCGA GTCTCTTATT 200
TCTTTGATTG GAAAGGTCCG TCCATGACAA TTGATACGGC CTGCTCTTCT 250
TCCCTCGTTG CCGTTCATCA GGCTGTCGAG ATTCTCCGGA GAGGTGATGT 300
TACCATGGCT GTGGCTGCCG GCGCCAACCT GATCTATGGT CCTGAGGCTT 350
ATATATCCGA GTCGAATCTG AACATGCTGT CGCCGAGCGG AAGATCGCGC 400 
ATGTGGGATT CAAGTGCGGA CGGATACGGC CGCGGAGAAG GGTTTGCGGC 450 
AGTGATGTTG AAGACCCTGA GCGCTGCAAT TCGTGATGGA GATCATATCG 500 
AGTGCATTAT CCGGGAGACA GGAATTAACC AGGATGGCAG AACAGCCGGA 550 
ATTACCATGC CAAGTGCTGT CAGCCAGACT CGATTGATCA AAGACACATA 600 
TGCTCGAGCT GGACTCGATT GCAGGAAAGA AGCGGAGAGA TGCCAGTACT 650 
TTGAAGGTAA GCGAATAACT TTTCTTGATA AACGCACTTA CTAAGATCTT 700 
TAA 703 



(2) INFORMATION FOR SEQ ID NO: 52: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 234 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Ile Leu Leu Glu Thr Ile Tyr Glu Gly Leu Glu Ser Ala Gly Leu
5 10 15

Thr Ile Lys Gly Leu Gln Gly Ser Gln Thr Ala Val Tyr Val Gly
20 25 30

Leu Met Ala Gly Asp Tyr Tyr Asp Ile Gln Met Arg Asp Ile Glu
35 40 45

Thr Leu Pro Arg Tyr Ala Ala Thr Gly Thr Ala Arg Ser Ile Met
50 55 60

Ser Asn Arg Val Ser Tyr Phe Phe Asp Trp Lys Gly Pro Ser Met
65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln
80 85 90

Ala Val Glu Ile Leu Arg Arg Gly Asp Val Thr Met Ala Val Ala
95 100 105

Ala Gly Ala Asn Leu Ile Tyr Gly Pro Glu Ala Tyr Ile Ser Glu
110 115 120

Ser Asn Leu Asn Met Leu Ser Pro Ser Gly Arg Ser Arg Met Trp
125 130 135

Asp Ser Ser Ala Asp Gly Tyr Gly Arg Gly Glu Gly Phe Ala Ala
140 145 150

Val Met Leu Lys Thr Leu Ser Ala Ala Ile Arg Asp Gly Asp His
155 160 165

Ile Glu Cys Ile Ile Arg Glu Thr Gly Ile Asn Gln Asp Gly Arg
170 175 180

Thr Ala Gly Ile Thr Met Pro Ser Ala Val Ser Gln Thr Arg Leu
185 190 195

Ile Lys Asp Thr Tyr Ala Arg Ala Gly Leu Asp Cys Arg Lys Glu
200 205 210

Ala Glu Arg Cys Gln Tyr Phe Glu Gly Lys Arg Ile Thr Phe Leu
215 220 225

Asp Lys Arg Thr Tyr Xaa Asp Leu Xaa
230

(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GCTGTTGCTG GAGGTAAGTT GGGAAGCTTT AGAAAATGCT GGCAAAGCAC 50 
CTGAAAAGCT AGCAGGAAGC AATACAGGTG TATTTGTTGG CATTAGCAAC 100 
TTTGATTATT CACAGTTGCA AATTAATCAA ACCGCTCAAC TAGATGCCTA 150 
TACAGGCACT GGCAATGCTT TTAGCATCGC AGCTAACCGT CTTTCCTATT 200 
TTCTAGACTT GCACGGACCT AGCTGGGCAG TAGACACAGC CTGTTCATCA 250 
TCTCTAGTAG CAGTCCATCA AGCTTGCCAA AGTCTGCGTC AAGGAGAATG 300 
CGAACTAGCC CTCGCTGGTG GTGTAAATCT GATTCTCACC CCACAATTAA 350 
CCATCACTTT TTCCCAAGCT GGGATGATGG CTGCTGATGG TCGTTGCAAA 400 
ACCTTTGATG CTGATGCTGA TGGTTACGTG CGGGGCGAAG GTTGTGGTGT 450 
TGTAATTCTC AAGCGTTTGG CCAACGCTCA ACGAGATGGA GACAATATTT 500 
TGGCAGTTAT TAT^GGTTCG GCAGTTAACC AAGATGGTCG CAGCAACGGA 550 
TTGACAGCAC CCAACGGTCA TGCCCAACAA GCAGTTATTC GCCAAGCATT 600 
ACAAAATGCC AATGTTGCAG CTGCCGAGAT TAGCTATGTA GAA 643 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
Leu Leu Leu Glu Val Ser Trp Glu Ala Leu Glu Asn Ala Gly Lys
5 10 15

Ala Pro Glu Lys Leu Ala Gly Ser Asn Thr Gly Val Phe Val Gly
20 25 30

Ile Ser Asn Phe Asp Tyr Ser Gln Leu Gln Ile Asn Gln Thr Ala
35 40 45

Gln Leu Asp Ala Tyr Thr Gly Thr Gly Asn Ala Phe Ser Ile Ala
50 55 60

Ala Asn Arg Leu Ser Tyr Phe Leu Asp Leu His Gly Pro Ser Trp
65 70 75

Ala Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln
80 85 90

Ala Cys Gln Ser Leu Arg Gln Gly Glu Cys Glu Leu Ala Leu Ala
95 100 105

Gly Gly Val Asn Leu Ile Leu Thr Pro Gln Leu Thr Ile Thr Phe
110 115 120

Ser Gln Ala Gly Met Met Ala Ala Asp Gly Arg Cys Lys Thr Phe
125 130 135

Asp Ala Asp Ala Asp Gly Tyr Val Arg Gly Glu Gly Cys Gly Val
140 145 150

Val Ile Leu Lys Arg Leu Ala Asn Ala Gln Arg Asp Gly Asp Asn
155 160 165

Ile Leu Ala Val Ile Lys Gly Ser Ala Val Asn Gln Asp Gly Arg
170 175 180

Ser Asn Gly Leu Thr Ala Pro Asn Gly His Ala Gln Gln Ala Val
185 190 195

Ile Arg Gln Ala Leu Gln Asn Ala Asn Val Ala Ala Ala Glu Ile
200 205 210

Ser Tyr Val Glu

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

TCTTTTTTTG GAGTGTGCTT GGGAAGCGCT GGAAAATGCT GGTTATGACC 50 
CGAAAACAGA CAAAAATCTA ATTGGCGTTT ATGCAGGGGG GAATCTAAGT 100 
ACCTACTTAC TTAACAATCT CGCCTCACAC CCTGAACTCA TTAAAGCGCT 150 
GGAGTCACAA ATTACAATTG CTAATGATAA GGACTTTATA TGCACACGAG 200 
TTTCTTACAA ATTAAACCTG AAAGGGCCGA GTATTAGTGT CGGCACGGCC 250 
TGCTCTACGT CATTAGTAGC AGTTCACTTG GCATGTCGAG GATTGCTAAG 300 
TTACCAGTGT GATATGGCAC TGGCTGGCGG TATTGCGATA CAAGTTCCAC 350 
AAAAACAAGG TTATTTCTAT CAAGAAGGTG GCATGGCCTC TCCTGATGGC 400 
CACTGTCGGG CCTTTGATGC TAAAGCACAA GGTAGCCCTT TTGGCAAAGG 450 
AGCAGGTATT GTCGTGCTGA AAAGATTGGA AGATGCTGTA GCTGATGGAG 500 
ACTGCATTTA TGCGGTTATC AAAGGTTCAG CCATCAATAA CGACGGTTCC 550 
GAGAAGGTGA GTTACACCGC ACCCAGTGTA ACAGGCCAAG CAGAAGTGAT 600 
TGCCGAGGCT CAGGCGATCG CTAACTTTGA TTCTGAAACA ATCACCTACA 650 
TTGAA 655



(2) INFORMATION FOR SEQ ID NO: 56: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 217

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
Leu Phe Leu Glu Cys Ala Trp Glu Ala Leu Glu Asn Ala Gly Tyr
5 10 15

Asp Pro Lys Thr Asp Lys Asn Leu Ile Gly Val Tyr Ala Gly Gly
20 25 30

Asn Leu Ser Thr Tyr Leu Leu Asn Asn Leu Ala Ser His Pro Glu
35 40 45

Leu Ile Lys Ala Leu Glu Ser Gln Ile Thr Ile Ala Asn Asp Lys
50 55 60

Asp Phe Ile Cys Thr Arg Val Ser Tyr Lys Leu Asn Leu Lys Gly
65 70 75

Pro Ser Ile Ser Val Gly Thr Ala Cys Ser Thr Ser Leu Val Ala
80 85 90

Val His Leu Ala Cys Arg Gly Leu Leu Ser Tyr Gln Cys Asp Met
95 100 105

Ala Leu Ala Gly Gly Ile Ala Ile Gln Val Pro Gln Lys Gln Gly
110 115 120

Tyr Phe Tyr Gln Glu Gly Gly Met Ala Ser Pro Asp Gly His Cys
125 130 135

Arg Ala Phe Asp Ala Lys Ala Gln Gly Ser Pro Phe Gly Lys Gly
140 145 150

Ala Gly Ile Val Val Leu Lys Arg Leu Glu Asp Ala Val Ala Asp
155 160 165

Gly Asp Cys Ile Tyr Ala Val Ile Lys Gly Ser Ala Ile Asn Asn
170 175 180

Asp Gly Ser Glu Lys Val Ser Tyr Thr Ala Pro Ser Val Thr Gly
185 190 195

Gln Ala Glu Val Ile Ala Glu Ala Gln Ala Ile Ala Asn Phe Asp
200 205 210

Ser Glu Thr Ile Thr Tyr Ile
215



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 765

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
ATTGCTGCTT GAAAACGTCT ATGAAGCTCT TGAAAACGGT GAGCGGTTCT 50 
TCAAGAGAAT ATTGATGCAT CAATATGCTA ACTTGATGTC AATCATCAGC 100 
TGGTATTCCT CTGAGCGAGT CCGTCTCTTC TAACACCTCC GTTTATGTTG 150 
GCTCATTCGG TGATGACTAT AAGACGATTC TCAATACCGA TTTTGAGAGT 200 
TGGGTCAAGT ACAAAGGCAC CGGTGTCTAT AACTCGATTC TGGCCAATCG 250
AATCAGCTGG TTCTACGACT TTAAAGGAGC CAGCGTCACG CTAGATACCG 300
CATGCTCGAG TAGCTTGGTA GCCGTGCATA TGGCTTGCCA GGATTTGAGG 350
TTGGGAGAGT CTAGAATGGT CAGTGTATTT CTCTATTGAA AAGTACTAGA 400
GGATTCTAAT TGACGTATTT GGATACCAGT CCGTTGTCGG CGGTGTCAAC 450
ATCATTGGCC ATCCGTTGCT CGTCCACGAT CTAAGCAAGC TCGGAGCGCT 500
CTCTCCTGAT GGCGTGTGCT ACACTTTCGA TGAACGGGCC AATGGATATT 550
CCCGGGGAGA AGGTGTCGGC ACCATCGTTC TCAAACGGCT CTCTGACGCA 600
ATCGAAGATG GTGATACCAT TCGCGCTATC ATCCGTGCAA GCGGGTGCAA 650
TCAAGACGGT AAAACAGCAG GTATATTTGT CCCTTCAGTC CAAGCCCAGG 700
AGCGACTTAT CCGGGATACC TATGAGAAGG CTGGGCTTGA CCGGACACGC 750 
ACGACATATT TGGAA 765 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
Leu Leu Leu Glu Asn Val Tyr Glu Ala Leu Glu Asn Ala Gly Ile
5 10 15

Pro Leu Ser Glu Ser Val Ser Ser Asn Thr Ser Val Tyr Val Gly
20 25 30

Ser Phe Gly Asp Asp Tyr Lys Thr Ile Leu Asn Thr Asp Phe Glu
35 40 45

Ser Trp Val Lys Tyr Lys Gly Thr Gly Val Tyr Asn Ser Ile Leu
50 55 60

Ala Asn Arg Ile Ser Trp Phe Tyr Asp Phe Lys Gly Ala Ser Val
65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Met
80 85 90

Ala Cys Gln Asp Leu Arg Leu Gly Glu Ser Arg Met Val Ser Ser
95 100 105

Val Val Gly Gly Val Asn Ile Ile Gly His Pro Leu Leu Val His
110 115 120

Asp Leu Ser Lys Leu Gly Ala Leu Ser Pro Asp Gly Val Cys Tyr
125 130 135

Thr Phe Asp Glu Arg Ala Asn Gly Tyr Ser Arg Gly Glu Gly Val
140 145 150

Gly Thr Ile Val Leu Lys Arg Leu Ser Asp Ala Ile Glu Asp Gly
155 160 165

Asp Thr Ile Arg Ala Ile Ile Arg Ala Ser Gly Cys Asn Gln Asp
170 175 180

Gly Lys Thr Ala Gly Ile Phe Val Pro Ser Val Gln Ala Gln Glu
185 190 195

Arg Leu Ile Arg Asp Thr Tyr Glu Lys Ala Gly Leu Asp Arg Thr
200 205 210

Arg Thr Thr Tyr Leu Glu
215

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 709

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
TAAGTTACTG GAAACAGCAT ATACTGCGTT TGAGAACGGT GAGTACGCCT 50 
TGCGTCGTAT CCCCTCCCCC CTCATGGAAG ATCTCAATCT GATCTCGTGA 100 
ScAGCCGGC ATCGGGTTAG AAGCGGCACG AGGATCAAAC ACTTCAGTAC 150 
ATATAGGTTG TTTTAATATC GACTATACAA GCAACCATAG TAGAGATCCA 200
SaGCAGATGC ACAAATATAC GGGGACTGGA GGAGCACCTT CCATGCTGTC 250
G^CAGACTG AGTTGGTTTT TCGATCTGAG AGGACCGAGC TTGACCTTGG 300
aCACGGCATG CTCTAGTAGC ATGGTTGCGC TTGATTTAGC ATGCCAGACT 350 
?SS?Jg?G gISJ??TGA StGGGTCTT GTCGGGGGTT GTAATCTCAT 400 
SaCAGCGTC GACATGACCA TGGCTCTATC CAAGCTTGGA TTTCTCTCCC 450 
ATAACAGTCG GTGCTACAGT TTTGACCATC GAGCGGATGG GTACGCCAGA 500 
GGTGAAGGCT TTGGAGTTTT AATTCTCAAA CGTGTCGAAG ACGCCATACG 550 
aPATrrrCAT ACTATACGAG GAGTCATTCG ATTAACAAGC TCCAATCAAG 600 
A?GG??a?a? TC?SgSS?a ACAATGCCCA GCAGAGACGC CCAAGCAAGT 650 
T?GATTAGAA AGACATACCA ACAAGCTGGA TTAGATATGC AGATGACAGG 700 
CTACTTTGA 709

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Lys Leu Leu Glu Thr Ala Tyr Thr Ala Phe Glu Asn Ala Gly Ile
5 10 15

Gly Leu Glu Ala Ala Arg Gly Ser Asn Thr Ser Val His Ile Gly
20 25 30

Cys Phe Asn Ile Asp Tyr Thr Ser Asn His Ser Arg Asp Pro Glu
35 40 45

Gln Met His Lys Tyr Thr Gly Thr Gly Gly Ala Pro Ser Met Leu
50 55 60

Ser Asn Arg Leu Ser Trp Phe Phe Asp Leu Arg Gly Pro Ser Leu
65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Met Val Ala Leu Asp Leu
80 85 90

Ala Cys Gln Thr Leu Gln Ser Gly Gln Ser Asp Met Gly Leu Val
95 100 105

Gly Gly Cys Asn Leu Ile Tyr Ser Val Asp Met Thr Met Ala Leu
110 115 120

Ser Lys Leu Gly Phe Leu Ser His Asn Ser Arg Cys Tyr Ser Phe
125 130 135

Asp His Arg Ala Asp Gly Tyr Ala Arg Gly Glu Gly Phe Gly Val
140 145 150

Leu Ile Leu Lys Arg Val Glu Asp Ala Ile Arg Asp Gly Asp Thr
155 160 165

Ile Arg Gly Val Ile Arg Leu Thr Ser Ser Asn Gln Asp Gly His
170 175 180

Thr Pro Gly Ile Thr Met Pro Ser Arg Asp Ala Gln Ala Ser Leu
185 190 195

Ile Arg Lys Thr Tyr Gln Gln Ala Gly Leu Asp Met Gln Met Thr
200 205 210

Gly Tyr Phe

(2) INFORMATION FOR SEQ ID NO: 61: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 649

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
AATGTTGCTC GAGATCACCT ACGAAGCCCT GGAGAACGCT GGACTTCCTT 50 
TGAGTAAGGT TGTCGGCTCT GATACAGCCT GCTTCATTGG TGGCTTTACA 100 
CGAGATTATG ATGATTTGAC CACTTCGGAG CTCGCGAAGA CCCTACTCTA 150 
CACAACTACC GGCAACGGCC TGACGATGAT GTCGAATCGC TTATCCTGGT 200
TCTACGACCT TCATGGCCCG TCGGTTTCGC TCGACACAGC ATGTTCTAGC 250
TCGCTGGTTG CACTAAACCT TGCATGCCAG ACAATCCGAG CATCGACGAA 300
TGACTCTCGA CAGGCGATAG TTGGAGGTGT CAATCTCATG CTGCTCCCTG 350
ATCAGATGAC CACGATTAAT CCTCTGCATT TCTTAAGTCC TGATAGCCAA 400
TGCTACTCGT TTGATGACCG TGCAAACGGT TACACCCGTG GAGAAGGTAT 450
TGGCATACTG GTGCTCAAGC ACATCAATGA TGCTATTCGA GATGGAGACT 500
GTATAAGGGC AGTAATCCGC GGCACTGGGG TCAACTCCGA TGGCAAGACC 550
CCTGGCATTA CCTTGCGAAG CACGGCTGCA CAAGCCTCTT TAATTCGCGC 600
AACGTACGCC TCGGCAGGGC TGGACCCAGC TCACACCGGC TACTTTGAA 649

(2) INFORMATION FOR SEQ ID NO:62: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 216 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
Met Leu Leu Glu Ile Thr Tyr Glu Ala Leu Glu Asn Ala Gly Leu
5 10 15

Pro Leu Ser Lys Val Val Gly Ser Asp Thr Ala Cys Phe Ile Gly
20 25 30

Gly Phe Thr Arg Asp Tyr Asp Asp Leu Thr Thr Ser Glu Leu Ala
35 40 45

Lys Thr Leu Leu Tyr Thr Thr Thr Gly Asn Gly Leu Thr Met Met
50 55 60

Ser Asn Arg Leu Ser Trp Phe Tyr Asp Leu His Gly Pro Ser Val
65 70 75

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu Asn Leu
80 85 90

Ala Cys Gln Thr Ile Arg Ala Ser Thr Asn Asp Ser Arg Gln Ala
95 100 105

Ile Val Gly Gly Val Asn Leu Met Leu Leu Pro Asp Gln Met Thr
110 115 120

Thr Ile Asn Pro Leu His Phe Leu Ser Pro Asp Ser Gln Cys Tyr
125 130 135

Ser Phe Asp Asp Arg Ala Asn Gly Tyr Thr Arg Gly Glu Gly Ile
140 145 150

Gly Ile Leu Val Leu Lys His Ile Asn Asp Ala Ile Arg Asp Gly
155 160 165

Asp Cys Ile Arg Ala Val Ile Arg Gly Thr Gly Val Asn Ser Asp
170 175 180

Gly Lys Thr Pro Gly Ile Thr Leu Pro Ser Thr Ala Ala Gln Ala
185 190 195

Ser Leu Ile Arg Ala Thr Tyr Ala Ser Ala Gly Leu Asp Pro Ala
200 205 210

His Thr Gly Tyr Phe Glu
215



(2) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 747 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
TATGCTACTT GAATGCACAT ACGAAGCGTT AGAGAATGGT CAGTGAGCTA 50 
CGAGCCGATT TTCATATATC ATGGCTAACA AGTTGAAGCT GGCATACCTC 100 
TAGATAAAGT AGTAGGAGAA CCCGTAGGGG TGTACGTCGG CTCAGCTAGT 150 
TCCGATTACT CGGACATCGT GAACTCAGAC GGCGAGATGG TCTCCACTTA 200 
CACGGCCACG GGGTTGGCCG CAACGATGAT GGCAAACCGC ATATCCTATT 250 
TCTATGATCT CCGGGGGCCA AGCTTCACAT TGGACACGGC GTGTTCATCG 300
AGTTTGATGG CGTTACACCT AGCGTGCCAA AGTCTTCGAG TCGGTGAATC 350
GAAGCAAGCC ATTGTGGGCG GGGTCCACCT TGTACTGAGC CCGGATTGTA 400
TGACTTCGAT GAGTTTATTA GGGTAAGACC TTCAAAATCT CCATGCAGAA 450 
TTTCTAAATC TAACCTACCA CCCTAGTTTG TTCTCTAATG ACGGCCGATC 500 
CTACACTTAT GACCATCGAG GTACTGGTTA TGGGCGCGGC GAAGGTATTG 550 
CTACCTTAGT AATAAAACCT CTTAAAGATG CGATGGAAGC CGGTGATAAC 600 
ATCCGGGCCA TCATCCGCAA TAGTGGGGCA AATCAAGATG GTCGAACACC 650 
AGGTGTGACT TTTCCAAGTC AAGATGCTCA GATAGATCTT ATGAGATCGG 700 
TATATCGTTC CGCTGGACTT GATGTACTTG ATACCGGCTA CGTGGAA 747 



(2) INFORMATION FOR SEQ ID NO: 64: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: 
Met Leu Leu Glu Cys Thr Tyr Glu Ala Leu Glu Asn Ala Gly Ile
5 10 15

Pro Leu Asp Lys Val Val Gly Glu Pro Val Gly Val Tyr Val Gly
20 25 30

Ser Ala Ser Ser Asp Tyr Ser Asp Ile Val Asn Ser Asp Gly Glu
35 40 45

Met Val Ser Thr Tyr Thr Ala Thr Gly Leu Ala Ala Thr Met Met
50 55 60

Ala Asn Arg Ile Ser Tyr Phe Tyr Asp Leu Arg Gly Pro Ser Phe
65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Leu His Leu
80 85 90

Ala Cys Gln Ser Leu Arg Val Gly Glu Ser Lys Gln Ala Ile Val
95 100 105

Gly Gly Val His Leu Val Leu Ser Pro Asp Cys Met Thr Ser Met
110 115 120

Ser Leu Leu Gly Leu Phe Ser Asn Asp Gly Arg Ser Tyr Thr Tyr
125 130 135

Xaa His Arg Gly Thr Gly Tyr Gly Arg Gly Xaa Gly Ile Ala Thr
140 145 150

Leu Val Ile Lys Pro Leu Lys Asp Ala Met Glu Ala Gly Asp Asn
155 160 165

Ile Arg Ala Ile Ile Arg Asn Ser Gly Ala Asn Gln Asp Gly Arg
170 175 180

Thr Pro Gly Val Thr Phe Pro Ser Gln Asp Ala Gln Ile Asp Leu
185 190 195

Met Arg Ser Val Tyr Arg Ser Ala Gly Leu Asp Val Leu Asp Thr
200 205 210

Gly Tyr Val Glu

(2) INFORMATION FOR SEQ ID NO: 65: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AATTCTACTT GAAGTCGCCT ATCAAGCAAT GGAGTCAAGC GGCTGCTTAC 50 

GGAACCATCG ACGCGAAGCT GGGGATCCTG TGGGATGTTT TATTGGAGCT 100 

AGCTTTGCCG AATATCTTGA CAACACCTGT TCTT^TCCGC CAACCAGCTA 150 

TACTTCCACT GGCACCATCA GAGCTTTCCA CTGCGGTAGA CTCAGTTATT 200

ACTTTGGATG GAGCGGTCCT GCCGAGGTCA TTGATACAGC TTGCTCCTCT 250

TCGTTGGTTG CTATCAATCG AGCTTGCAAG TCAGTGCAGG CGGGTGAATG 300

TACAATGGCT CTTACTGGTG GAGTGAACAT TATAACTGGT ATCCACAACT 350

TCTTAGATCT GGCAAAGGCT GGCTTYTTAA GCCCCACAGG CCAATGCAGA 400

CCCTTTGACC AGTCTGCAGA TGGGTATTGT CGCTCAGAAG GAGCAGGACT 450

TGTTGTACTA AAACTGTTAA GCCAAGCCAT AGCAGATGGA GATCAAATTT 500

TCGGAGTTAT TCCAAGTGTG TCCACCAACC AAGGCGGATT GTCATCTTCA 550

ATTACGATTC CTCATTCGCC TGCACAAAAA AAGTTGTATC AAACCGTGCT 600 

TCGGCAAGCC GGCATGAAGC TAGAACAGGT TAGCTACGTA GAG 643 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
Ile Leu Leu Glu Val Ala Tyr Gln Ala Met Glu Ser Ser Gly Cys
5 10 15

Leu Arg Asn His Arg Arg Glu Ala Gly Asp Pro Val Gly Cys Phe
20 25 30

Ile Gly Ala Ser Phe Ala Glu Tyr Leu Asp Asn Thr Cys Ser Asn
35 40 45

Pro Pro Thr Ser Tyr Thr Ser Thr Gly Thr Ile Arg Ala Phe His
50 55 60

Cys Gly Arg Leu Ser Tyr Tyr Phe Gly Trp Ser Gly Pro Ala Glu
65 70 75

Val Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Ile Asn Arg
80 85 90

Ala Cys Lys Ser Val Gln Ala Gly Glu Cys Thr Met Ala Leu Thr
95 100 105

Gly Gly Val Asn Ile Ile Thr Gly Ile His Asn Phe Leu Asp Leu
110 115 120

Ala Lys Ala Gly Phe Leu Ser Pro Thr Gly Gln Cys Arg Pro Phe
125 130 135

Asp Gln Ser Ala Asp Gly Tyr Cys Arg Ser Glu Gly Ala Gly Leu
140 145 150

Val Val Leu Lys Leu Leu Ser Gln Ala Ile Ala Asp Gly Asp Gln
155 160 165

Ile Phe Gly Val Ile Pro Ser Val Ser Thr Asn Gln Gly Gly Leu
170 175 180

Ser Ser Ser Ile Thr Ile Pro His Ser Pro Ala Gln Lys Lys Leu
185 190 195

Tyr Gln Thr Val Leu Arg Gln Ala Gly Met Lys Leu Glu Gln Val
200 205 210

Ser Tyr Val Glu



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 809 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
AGGAAACTAC TAGAGGTCGT GTTTGAATGT TTTGAGAGTG CCGGTACACC 50 
ACTTCACGCA GTTTCAGGAG CTAATATTGG CTGCTATGTT GGGAATTTTA 100 
CGTTGGATTA TCTTGTCATG CAGTCTAAGG ATACAGACTC TTTTCATCGA 150 
TATACTGCTC CAGGAATGGG ACCTACATTG TTAGCTAACC GCATAAGTCA 200 
TGTTTTTAAT CTTCAAGGTC CAAGTGTTAT GCTTGATACA GCGTGTTCTT 250 
CATCGATCTA CGCTCTTCAT GCAGCTTGTG TGGCCTTGAA TGCAGATGAG 300 
TGCAATGCAG CAATTGTTGC TGGGGCAAAC CTAATCCAGT CACCTGAGTG 350 
GCATCTTGCA GTCTCCAAAT CAGGTGTGAT TTCACAAACT TCCACGTGTC 400 
ACACTTTCGA TGCTAGTGCG GATGGTTATG GGCGAGGCGA GGGCGTTGGG 450 
GCCCTCTATC TCAAGCGTCT AAGTGACGCA ATCCGAGATC GAGATCCTAT 500 
ACGGTCTGTT ATTCGTGGTA CAGCTGTTAA TAGGTTAGTA CATCCTCTTA 550 
CCTTTCTTTC ATGGATTAGC GAGAATTAGG GTTCCAAATG TTTGAAAGCT 600 
CGGGTTCTAA TATTCATTCA CTGGACTAGT AATGGCAAGA CAAACGGCAT 650 
CAGTCAGCCT AGTGCTTTGG CACAGGAAGC TGTGATTAAA AAAGCTTATG 700 
CAAAGGCGGG ATTACCTGTT ACCGAGACTG ACTATGTTGA GGTAAGTGAG 750 
CTATGTTTAA ATCAGAAAAC GTCATGCCAT TATTTCTTAT CCTTCACTGA 800 
NCTCTTACA 809 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: 
Arg Lys Leu Leu Glu Val Val Phe Glu Cys Phe Glu Ser Ala Gly
5 10 15

Thr Pro Leu His Ala Val Ser Gly Ala Asn Ile Gly Cys Tyr Val
20 25 30

Gly Asn Phe Thr Leu Asp Tyr Leu Val Met Gln Ser Lys Asp Thr
35 40 45

Asp Ser Phe His Arg Tyr Thr Ala Pro Gly Met Gly Pro Thr Leu
50 55 60

Leu Ala Asn Arg Ile Ser His Val Phe Asn Leu Gln Gly Pro Ser
65 70 75

Val Met Leu Asp Thr Ala Cys Ser Ser Ser Ile Tyr Ala Leu His
80 85 90

Ala Ala Cys Val Ala Leu Asn Ala Asp Glu Cys Asn Ala Ala Ile
95 100 105

Val Ala Gly Ala Asn Leu Ile Gln Ser Pro Glu Trp His Leu Ala
110 115 120

Val Ser Lys Ser Gly Val Ile Ser Gln Thr Ser Thr Cys His Thr
125 130 135

Phe Asp Ala Ser Ala Asp Gly Tyr Gly Arg Gly Glu Gly Val Gly
140 145 150

Ala Leu Tyr Leu Lys Arg Leu Ser Asp Ala Ile Arg Asp Arg Asp
155 160 165

Pro Ile Arg Ser Val Ile Arg Gly Thr Ala Val Asn Ser Asn Gly
170 175 180

Lys Thr Asn Gly Ile Ser Gln Pro Ser Ala Leu Ala Gln Glu Ala
185 190 195

Val Ile Lys Lys Ala Tyr Ala Lys Ala Gly Leu Pro Val Thr Glu
200 205 210

Thr Asp Tyr Val Glu Val Ser Glu Leu Cys Leu Asn Gln Lys Thr
215 220 225

Ser Cys His Tyr Phe Leu Ser Phe Thr Xaa Leu Leu
230 235

(2) INFORMATION FOR SEQ ID NO: 69: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 658 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

TTTGCTCCTT GAGACTGTCT ACGAAGCTCT GGAAGCAGGC GGTCACACGA 50 

TTGAAGCGCT ACGAGGATCT GATACGTCTG TCTTTACAGG CACCATGGGC 100 

GTCGACTACA ACGATACTGT TATACGGGAC CTGAACGTCA TCCCGACGTA 150 

CTTTGCTACT GGAGTAAATC GAGCTATCAT CTCGAACCGA GTCTCATACT 200 

TCTTTGACTG GCATGGGCCG AGCATGACCA TCGACACAGC CTGTTCATCC 250 

AGTCTCGTCG CCGTGCACCA AGGAGTGAAA GCTCTTCGGA GTGGGGAGTC 300 

GCGTACTGCC CTGGCATGTG GGACGCAGGT CATTCTAAAT CCCGAGATGT 350 

ATGTTATTGA GAGCAAGCTG AAAATGCTTT CTCCTACGGG CCGCTCCCGC 400 

ATGTGGGATG CGGACGCGGA TGGCTACGCT CGTGGGGAGG GCGTAGCGGC 450

TGTAGTGCTG AAACGGCTCA GTGACGCTAT TGCGGATGGA SATCGCATCG 500 

AGTGCATCAT CCGTGAGACA GGGTCCAACC AAGACGGCCA TTCAAATGGT 550 

ATCACGGTGC CGAGTACGGA GGCCCAAGCG GCCCTCATCC ACCAAACCTA 600 

TGCCAGAGCT GGTCTAGACC CGGAAAATAA CCCTCACGAC CGCCCTCAGT 650 

TCTTCGAA 658 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
Leu Leu Leu Glu Thr Val Tyr Glu Ala Leu Glu Ala Gly Gly His
5 10 15

Thr Ile Glu Ala Leu Arg Gly Ser Asp Thr Ser Val Phe Thr Gly
20 25 30

Thr Met Gly Val Asp Tyr Asn Asp Thr Val Ile Arg Asp Leu Asn
35 40 45

Val Ile Pro Thr Tyr Phe Ala Thr Gly Val Asn Arg Ala Ile Ile
50 55 60

Ser Asn Arg Val Ser Tyr Phe Phe Asp Trp His Gly Pro Ser Met
65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln
80 85 90

Gly Val Lys Ala Leu Arg Ser Gly Glu Ser Arg Thr Ala Leu Ala
95 100 105

Cys Gly Thr Gln Val Ile Leu Asn Pro Glu Met Tyr Val Ile Glu
110 115 120

Ser Lys Leu Lys Met Leu Ser Pro Thr Gly Arg Ser Arg Met Trp
125 130 135

Asp Ala Asp Ala Asp Gly Tyr Ala Arg Gly Glu Gly Val Ala Ala
140 145 150

Val Val Leu Lys Arg Leu Ser Asp Ala Ile Ala Asp Gly Xaa Arg
155 160 165

Ile Glu Cys Ile Ile Arg Glu Thr Gly Ser Asn Gln Asp Gly His
170 175 180

Ser Asn Gly Ile Thr Val Pro Ser Thr Glu Ala Gln Ala Ala Leu
185 190 195

Ile His Gln Thr Tyr Ala Arg Ala Gly Leu Asp Pro Glu Asn Asn
200 205 210

Pro His Asp Arg Pro Gln Phe Phe Glu
215

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
TGGGCTACTC GAGACTGCTT ACAAGGCGTT CGAAAACGGT GAGTCTTGAA 50 
GCTGCACAGA TCAAGACAAG AACACTAAAT CTCTCAGCGG GCATACGCAT 100 
AGAAGAAGCC GCTGGCTCTA GAACTTCAGT TCATATCGGG AGTTTCACTC 150 
ATGATTGGAG AGACATCCTC CAAAGGGATC CACTAATGGA TGTTAGCTAC 200 
ATAGCTACCG CAACCGAGGT TTCTATGCTA GCGAGTCGAC TCAGCTGGTT 250 
TTATGATCTA AGTGGGCCYA GCATCTCCTT GGATACAGCG TGTTCGAGTA 300
GCTTAATGGC TTTACATCTC GCCTGCCAGA GTCTAAAGAG TCGAGAGGCC 350 
GACATGGTAA GGCTATGCTA CTTTCTGGCT CACTCAAACT GTTTTCCATA 400 
TCTGATGCTT GCACAGGGCC TTGTTGGGAG GGGCTAATCT TCTTTTGGAT 450 
CCTGTAGGGG TTATTGGCAT AACAAATGTT GGCATGCTTT CGCCAGATGG 500 
CATTAGTTAC AGCTTTGATC ATCGTGCAAA CGGGTATGCC CGAGGAGAAG 550 
GGTTCGGAGT CGTTGTCATC AAACGCTTGG ACGATGCTCT CAGACATGGC 600 
GATACTATTC GCGGTATCGT TCGTGCCACA GGATCGAATC AAGATGGAAG 650 
AACTCCAGGG ATTACCCAAC CTGATGGAGC CGCGCAAGAA GAGCTCATCC 700 
GAGACACTTA CAAAGCTGCT GGCTTAGATA TGAGGCTAGT AAGGTATTCT 750 
TAA 753 



(2) INFORMATION FOR SEQ ID NO: 72: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
Gly Leu Leu Glu Thr Ala Tyr Lys Ala Phe Glu Asn Ala Gly Ile
5 10 15

Arg Ile Glu Glu Ala Ala Gly Ser Arg Thr Ser Val His Ile Gly
20 25 30

Ser Phe Thr His Asp Trp Arg Asp Ile Leu Gln Arg Asp Pro Leu
35 40 45

Met Asp Val Ser Tyr Ile Ala Thr Ala Thr Glu Val Ser Met Leu
50 55 60

Ala Ser Arg Leu Ser Trp Phe Tyr Asp Leu Ser Gly Pro Ser Ile
65 70 75

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Leu His Leu
80 85 90

Ala Cys Gln Ser Leu Lys Ser Arg Glu Ala Asp Met Gly Leu Val
95 100 105

Gly Gly Ala Asn Leu Leu Leu Asp Pro Val Gly Val Ile Gly Ile
110 115 120

Thr Asn Val Gly Met Leu Ser Pro Asp Gly Ile Ser Tyr Ser Phe
125 130 135

Asp His Arg Ala Asn Gly Tyr Ala Arg Gly Glu Gly Phe Gly Val
140 145 150

Val Val Ile Lys Arg Leu Asp Asp Ala Leu Arg His Gly Asp Thr
155 160 165

Ile Arg Gly Ile Val Arg Ala Thr Gly Ser Asn Gln Asp Gly Arg
170 175 180

Thr Pro Gly Ile Thr Gln Pro Asp Gly Ala Ala Gln Glu Glu Leu
185 190 195

Ile Arg Asp Thr Tyr Lys Ala Ala Gly Leu Asp Met Arg Leu Val
200 205 210

Arg Tyr Ser



(2) INFORMATION FOR SEQ ID NO: 73: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 
ATTGTTGCTC GAAGTAACCT ATGAAGCTTT AGAGAACGGT GGGTAGTTCC 50 
AGGAAGCATT AATCAAGACA AAGCTATTGC TCACACTTTT CCAAAATAGC 100 
CGGAATACCC TTGAACCAAA TTGTGGGCCA GGATGTTGGG GTTTTTGTTG 150 
GCGGCTCAAT GTCCGACTAC CAGAACCTCC TCCACAAAGA CATCGCAAAT 200
GGTCCTATTT ACCAAGCCAC TGGCACTGCC ATGAGCTTCC TAGCCAACCG 250 
AATATCTTAC ATCTATGACC TCAAGGGCCC AAGCGTAACA GTGGACACTG 300
CATGCTCCTC GGGTCTCACG GCACTTCATT TAGCATGCCA GAGCATACGC 350
ACTGGTGAGA TCCGACAAGC TTTGGTCGGC GGTGTATACA TTATCCTAAG 400
CCCGGAGAAT ATGATTGCCA TGAGCATGCT GGGGTGATGT CTCCTGTTCC 450
AGAAAGTAAT TGATAAAAGC TAATGCCAGT AGACTGTTTG GCACCGACGG 500 
TCTCTCATAC AGCTATGATC ACCGAGCAAC TGGATATGGA CGTGGTGAAG 550 
GAGGAGGCAT GATAGTCTTA AAGTCGCTAG ACGACGCGAT GGCAAACGGA 600 
GACACAATAC ATGCGGTAAT TCGGCACACA GGGACAAATC AGGATGGTAA 650 
GACCAGCGGC CCAACAATGC CCAGTCTGGA AGCCCAGGAG AGACTCATCA 700 
AGAAAGTTTA CAGCCAGGCT GGTCTGGATC CATTGGATAC AGAATATGTC 750 
GAG 753

(2) INFORMATION FOR SEQ ID NO:74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
Leu Leu Leu Glu Val Thr Tyr Glu Ala Leu Glu Asn Ala Gly Ile
5 10 15

Pro Leu Asn Gln Ile Val Gly Gln Asp Val Gly Val Phe Val Gly
20 25 30

Gly Ser Met Ser Asp Tyr Gln Asn Leu Leu His Lys Asp Ile Ala
35 40 45

Asn Gly Pro Ile Tyr Gln Ala Thr Gly Thr Ala Met Ser Phe Leu
50 55 60

Ala Asn Arg Ile Ser Tyr Ile Tyr Asp Leu Lys Gly Pro Ser Val
65 70 75

Thr Val Asp Thr Ala Cys Ser Ser Gly Leu Thr Ala Leu His Leu
80 85 90

Ala Cys Gln Ser Ile Arg Thr Gly Glu Ile Arg Gln Ala Leu Val
95 100 105

Gly Gly Val Tyr Ile Ile Leu Ser Pro Glu Asn Met Ile Ala Met
110 115 120

Ser Met Leu Gly Leu Phe Gly Thr Asp Gly Leu Ser Tyr Ser Tyr
125 130 135

Asp His Arg Ala Thr Gly Tyr Gly Arg Gly Glu Gly Gly Gly Met
140 145 150

Ile Val Leu Lys Ser Leu Asp Asp Ala Met Ala Asn Gly Asp Thr
155 160 165

Ile His Ala Val Ile Arg His Thr Gly Thr Asn Gln Asp Gly Lys
170 175 180

Thr Ser Gly Pro Thr Met Pro Ser Leu Glu Ala Gln Glu Arg Leu
185 190 195

Ile Lys Lys Val Tyr Ser Gln Ala Gly Leu Asp Pro Leu Asp Thr
200 205 210

Glu Tyr Val Glu



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 692 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: 
AATGCTGCTT GAGGTAGTCT ATGAGGCGTT AGAAGACGGT AAGTCTAACG 50 
AATTTCAATC AGTGGTCCTG AGCTAATTGC GATCAAGCTG GCATTACGCT 100 
CGACGACATT AAGGGTTCCC AGACATCTGT CTACTGTGGG AGCTTCACCA 150 
ACGACTACCG TGAAATGCTG AACAAAGATT TGGGGTACTA CCCCAAGTAC 200 
ATGGCCACTG GTGTTGGAAA CTCCATCTTA GCCAACCGCA TTTCATATTT 250 
CTATGACCTA CACGGACCAA GTGTGACTGT CGACACAGCC TGCTCTCTTC 300 
CCCTGGTCTC ATTCCATATG GGCAACAGAT CAATCCMAGA TGGAGATGCT 350 
GACATCTCAA TCGTCATTGG ATCTTCGCTC CATTTTGATC CCAACATGTT 400 
CGTCACTATG ACGGACCTTG GGTTTCTCTC AACCGACGGC AGATGCCGTG 450 
CTTTTGACGC TAGCGGAAAG GGGTATGTCC GCGGTGAGGG CATCTGCGCT 500 
GTTGTTTTGA AACAAAAATC ACGCGCTGAA CTTCACGACA ACAACGTTCG 550 
ATCCGTCATT CGTGGCTCGG ATGTCAACCA CGACGGTGCC AAAGACGGTA 600 
TCACAATGCC AAACTCGAAG GCTCAGGAGA GCCTCATCAG AAAGACCTAC 650 
AAAAACGCTG GACTGAGTAC AAACGACACC CAGTACTTTG AG 692 



(2) INFORMATION FOR SEQ ID NO: 76: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
Met Leu Leu Glu Val Val Tyr Glu Ala Leu Glu Asp Ala Gly Ile
5 10 15

Thr Leu Asp Asp Ile Lys Gly Ser Gln Thr Ser Val Tyr Cys Gly
20 25 30

Ser Phe Thr Asn Asp Tyr Arg Glu Met Leu Asn Lys Asp Leu Gly
35 40 45

Tyr Tyr Pro Lys Tyr Met Ala Thr Gly Val Gly Asn Ser Ile Leu
50 55 60

Ala Asn Arg Ile Ser Tyr Phe Tyr Asp Leu His Gly Pro Ser Val
65 70 75

Thr Val Asp Thr Ala Cys Ser Leu Pro Leu Val Ser Phe His Met
80 85 90

Gly Asn Arg Ser Ile Xaa Asp Gly Asp Ala Asp Ile Ser Ile Val
95 100 105

Ile Gly Ser Ser Leu His Phe Asp Pro Asn Met Phe Val Thr Met
110 115 120

Thr Asp Leu Gly Phe Leu Ser Thr Asp Gly Arg Cys Arg Ala Phe
125 130 135

Asp Ala Ser Gly Lys Gly Tyr Val Arg Gly Glu Gly Ile Cys Ala
140 145 150

Val Val Leu Lys Gln Lys Ser Arg Ala Glu Leu His Asp Asn Asn
155 160 165

Val Arg Ser Val Ile Arg Gly Ser Asp Val Asn His Asp Gly Ala
170 175 180

Lys Asp Gly Ile Thr Met Pro Asn Ser Lys Ala Gln Glu Ser Leu
185 190 195

Ile Arg Lys Thr Tyr Lys Asn Ala Gly Leu Ser Thr Asn Asp Thr
200 205 210

Gln Tyr Phe Glu

(2) INFORMATION FOR SEQ ID NO: 77: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
TATTTTATTG GAGACAACAT ACGAAGCACT TGAAAATAGT GAGTAAGCCA 50 
TGACCGTATT AAGTAAAAGC TCACGAACAG TAAAGGTGGC ACCCCTCTGG 100 
CTAGCATTCG CGGCCAAAAT GTAGGCGTTT ACGTTGGTGC ATCCATGTCA 150 
GACTACAACG AGCTTTTCGC AAAGGACCCG GATACCAATT TGACATATCG 200 
TATTACCGGA ACTGCATCAA ATATTTTGTC AAATCGACTC TCCTACATGT 250 
TCGACCTTCA CGGGCCAAGT TTCACGGTGG ACACTGCGTG CTCATCAAGC 300 
TTGGCCGCAT TCCATCTGGC CTGTCAGAGT TTGAAGACGG GAGAGGTCCG 350 
GCAAGCCATC GTGGGCGGGG CTTACCTTGT ATTATCCCCA GATCCTACGA 400 
TCGGAATGAG CAAACTCAGG CTTTACGGCG AACATGGTCG CTCATACACT 450 
TACGATCACC GAGGGACTGG ATACGGTCGT GGCGAGGGCG TCGCTAGCCT 500 
AATTCTTAAG CCTTTACAAG ATGCTATCGA CGTGGGTGAT ACAATTCGAG 550 
CAATCATACG TAACACTGGA ATGAATCAAG ACGGGAAGAC GAACGGAATT 600 
ACGCTCCCAA GCAAAGACGC CCAAGAAAGC CTCATAAGGT CTGTCTACAC 650 
AGCTGCAGGT CTCGATCCAC TGTATACTTC CTACGTTGAG 690 

(2) INFORMATION FOR SEQ ID NO:78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
Ile Leu Leu Glu Thr Thr Tyr Glu Ala Leu Glu Asn Ser Gly Thr
5 10 15

Pro Leu Ala Ser Ile Arg Gly Gln Asn Val Gly Val Tyr Val Gly
20 25 30

Ala Ser Met Ser Asp Tyr Asn Glu Leu Phe Ala Lys Asp Pro Asp
35 40 45

Thr Asn Leu Thr Tyr Arg Ile Thr Gly Thr Ala Ser Asn Ile Leu
50 55 60

Ser Asn Arg Leu Ser Tyr Met Phe Asp Leu His Gly Pro Ser Phe
65 70 75

Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Ala Ala Phe His Leu
80 85 90

Ala Cys Gln Ser Leu Lys Thr Gly Glu Val Arg Gln Ala Ile Val
95 100 105

Gly Gly Ala Tyr Leu Val Leu Ser Pro Asp Pro Thr Ile Gly Met
110 115 120

Ser Lys Leu Arg Leu Tyr Gly Glu His Gly Arg Ser Tyr Thr Tyr
125 130 135

Asp His Arg Gly Thr Gly Tyr Gly Arg Gly Glu Gly Val Ala Ser
140 145 150

Leu Ile Leu Lys Pro Leu Gln Asp Ala Ile Asp Val Gly Asp Thr
155 160 165

Ile Arg Ala Ile Ile Arg Asn Thr Gly Met Asn Gln Asp Gly Lys
170 175 180

Thr Asn Gly Ile Thr Leu Pro Ser Lys Asp Ala Gln Glu Ser Leu
185 190 195

Ile Arg Ser Val Tyr Thr Ala Ala Gly Leu Asp Pro Leu Tyr Thr
200 205 210

Ser Tyr Val Glu



(2) INFORMATION FOR SEQ ID NO: 79: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 761 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
GCGAATGCTA GAGACGGCTT ATCACGCTCT GGAGGACGGT AAGTCTAACC 50 
AGTGCAAATT TAGGGGCTAT AATCTTGGTG TGTGAGAATA ACATACCATC 100 
AGCGAGCATC CCCCTGGAGA AGTGCTTCGG CTCAGACACT TCCGTTTATA 150 
C?GGG?§CTT ScCAACGAT TATCTCAGCA TACTGCAGCA AGACTTTGAG 200
GCTGAGCAAA GGCACGCAGC CATGGGAATC GCGCCCTCCA TGTTGGCCAA 250
TCGCCTAAGC TGGTTCTTCA ACTTCAAGGG GACATCGATG AACCTGGATT 300
CGGCCTGCTC CAGCAGTCTG GTTGCACTGC ATCTTGCTTC ACAGGACCTC 350
CGTGCTGGTA CCACATCGAT GGTATGTATC GATCATAAAA TCACGTACTC 400
OTTCATTAAT AAATAAATGT TTTAGGCACT AGTTGGAGGG GCGAATCTTG 450 
TCTACCACCC CGACTTCATG GAGATGATGT CAAACTTCAA CTTCCTGTCT 500 
CCCGACAGCC GTTCTTGGAG TTTCGATCAA CGTGCTAATG GTTATGCGCG 550 
5SggSgGA ACCGCCGTGA TGGTCGTCAA ACGCCTTGCA GATGCACTGC 600 
GAGATGGAGA TACAATCAGA ACCGTAATCT GGAGTACCGG GTCGAACCAA 650 
GACGGGAGAA CACCTGGGAT CACGCAGCCA AGTAAAGAAG CGCAGTTAAA 700 
?CT?ATCGAG CGCACCTACA AACAAGCGAA GATTGATATG GAGCCTACCA 750 
GATTCTTCGA G 761

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 
(iii) HYPOTHETICAL: no
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80: 
Arg Met Leu Glu Thr Ala Tyr His Ala Leu Glu Asp Ala Ser Ile
5 10 15

Pro Leu Glu Lys Cys Phe Gly Ser Asp Thr Ser Val Tyr Thr Gly
20 25 30

Cys Phe Thr Asn Asp Tyr Leu Ser Ile Leu Gln Gln Asp Phe Glu
35 40 45

Ala Glu Gln Arg His Ala Ala Met Gly Ile Ala Pro Ser Met Leu
50 55 60

Ala Asn Arg Leu Ser Trp Phe Phe Asn Phe Lys Gly Thr Ser Met
65 70 75

Asn Leu Asp Ser Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu
80 85 90

Ala Ser Gln Asp Leu Arg Ala Gly Thr Thr Ser Met Ala Leu Val
95 100 105

Gly Gly Ala Asn Leu Val Tyr His Pro Asp Phe Met Glu Met Met
110 115 120

Ser Asn Phe Asn Phe Leu Ser Pro Asp Ser Arg Ser Trp Ser Phe
125 130 135

Asp Gln Arg Ala Asn Gly Tyr Ala Arg Gly Glu Gly Thr Ala Val
140 145 150

Met Val Val Lys Arg Leu Ala Asp Ala Leu Arg Asp Gly Asp Thr
155 160 165

Ile Arg Thr Val Ile Trp Ser Thr Gly Ser Asn Gln Asp Gly Arg
170 175 180

Thr Pro Gly Ile Thr Gln Pro Ser Lys Glu Ala Gln Leu Asn Leu
185 190 195

Ile Glu Arg Thr Tyr Lys Gln Ala Lys Ile Asp Met Glu Pro Thr
200 205 210

Arg Phe Phe Glu

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1221 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 



WO 98/53097 PCT/CA98/00488

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
AAGGAGGGGC CGCCCGGGAG AAGAAGTTAT CGTGGGCGCC GATTCGGTCG 50 
ACCGGCAGCA ATTGCAGCCA GATTGCCGCG AGGGCTTCCT CCATTCCCGG 100 
CGCGGGCGCA ACGAATCCGG TGTACTCCAG ATGCCGTGCG GTCCGGGGGA 150 
GAGCTGCCTG ATCCAGTTTG AGATTCTTGT TTAAAGGAAG TTCGGCCAGC 200 
TTCTCTATGG CGGCGGGGAC CATGTGAGCG GGGAGCAGAG CCTTCATGTG 250 
CTGGCGAATC GTTTCCGTGG ACGCTCCGCC GACTGCATAC GCCGCGAGAT 300 
ACTTCTCGCC GGGGATATCG TCTCGGACCA GCACAACGCC GTCCGTGACG 350
CCCGGGCACG ACTGCAGCGC GGCCTGAATT TCGCCGAGTT CTATGCGATG 400 
CCCGCGAAGC TTGATCTGGC CGTCGTTTCT GCCCAGAAAA TCGATGCGCC 450
CATCCGGCAG ATAGCGCGCG CGATCGCCCG TGCGGTACAT ACGCGCGCCC 500 
GGAAATGGGC TAAACGGGTT CGGCACAAAG TAGGCTGCGG TGAGATCGCT 550 
GCGCCCCGCA TAGCCGCGCG CGACACCGTC TCCGGCAGCG TACAGCCAGC 600 
CTTCCACTCC CGGCGGAACG GGAGCGAATT GCTCGTCGAG CACGTAGGTT 650 
TGGACGTTCG AAATTGGACG GCCGATGGGA ATCGACGGGG TCCCGGCGGG 700 
GACCGAATCG ATGACGCCAC ACGCCGTGAG CATCGTGTTC TCGGTAGGGC 750 
CGTAACCGTT CAAGAGGCGG GCGGGCTTGC CGTGCTCGAT CACCATGCGC 800 
ATCCAGTGGG GATCCAGCGC TTCGCCGCCG ACAATCACAT TGGTCAGCGA 850 
TTCGAATCCG GCTGGATCTT CGCGGGCAAC CTGATTGAAC AGAGATGCAG 900 
TAAGGATAAT CGTGTCCACG TGGAAGCGGC GAAAGGCGAG AATCAGCTCG 1000 
CGGGGCGCCA TCAAGGTCTC TTTCGAAAGA ACGACGATTC GCGCGCCATG 1050 
CAGCAGGCCG CCCCATAACT CGAAGGTGGG AGGGTCGAAA CCGAAGGCCG 1100 
ACATCTGTCC CACGGTATCG GCGGGTGAGA ATTGTACGTA GTTGGTCCGG 1150 
CTAACGAGGT TGACAATCGC CCCGTGGGGG ACGGCGACCC CCTTGGGCTT 1200 
GCCGGTCGTG CCGGACGTGT A 1221 

(2) INFORMATION FOR SEQ ID NO:82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Ala Val Pro
5 10 15

His Gly Ala Ile Val Asn Leu Val Ser Arg Thr Asn Tyr Val Gln
20 25 30

Phe Ser Pro Ala Asp Thr Val Gly Gln Met Ser Ala Phe Gly Phe
35 40 45

Asp Pro Pro Thr Phe Glu Leu Trp Gly Gly Leu Leu His Gly Ala
50 55 60

Arg Ile Val Val Leu Ser Lys Glu Thr Leu Met Ala Pro Arg Glu
65 70 75

Leu Ile Leu Ala Phe Arg Arg Phe His Val Asp Thr Ile Ile Leu
80 85 90




Thr Ala Ser Leu Phe Asn Gln Val Ala Arg Glu Asp Pro Ala Gly
95 100 105

Phe Glu Ser Leu Thr Asn Val Ile Val Gly Gly Glu Ala Leu Asp
110 115 120

Pro His Trp Met Arg Met Val Ile Glu His Gly Lys Pro Ala Arg
125 130 135

Leu Leu Asn Gly Tyr Gly Pro Thr Glu Asn Thr Met Leu Thr Ala
140 145 150

Cys Gly Val Ile Asp Ser Val Pro Ala Gly Thr Pro Ser Ile Pro
155 160 165

Ile Gly Arg Pro Ile Ser Asn Val Gln Thr Tyr Val Leu Asp Glu
170 175 180

Gln Phe Ala Pro Val Pro Pro Gly Val Glu Gly Trp Leu Tyr Ala
185 190 195

Ala Gly Asp Gly Val Ala Arg Gly Tyr Ala Gly Arg Ser Asp Leu
200 205 210

Thr Ala Ala Tyr Phe Val Pro Asn Pro Phe Ser Pro Phe Pro Gly
215 220 225

Ala Arg Met Tyr Arg Thr Gly Asp Arg Ala Arg Tyr Leu Pro Asp
230 235 240

Gly Arg Ile Asp Phe Leu Gly Arg Asn Asp Gly Gln Ile Lys Leu
245 250 255

Arg Gly His Arg Ile Glu Leu Gly Glu Ile Gln Ala Ala Leu Gln
260 265 270

Ser Cys Pro Gly Val Thr Asp Gly Val Val Leu Val Arg Asp Asp
275 280 285

Ile Pro Gly Glu Lys Tyr Leu Ala Ala Tyr Ala Val Gly Gly Ala
290 295 300

Ser Thr Glu Thr Ile Arg Gln His Met Lys Ala Leu Leu Pro Ala
305 310 315

His Met Val Pro Ala Ala Ile Glu Lys Leu Ala Glu Leu Pro Leu
320 325 330

Asn Lys Asn Leu Lys Leu Asp Gln Ala Ala Leu Pro Arg Thr Ala
335 340 345

Arg His Leu Glu Tyr Thr Gly Phe Val Ala Pro Ala Pro Gly Met
350 355 360

Glu Glu Ala Leu Ala Ala Ile Trp Leu Gln Leu Leu Pro Val Asp
365 370 375

Arg Ile Gly Ala His Asp Asn Phe Phe Ser Arg Ala Ala Pro Pro
380 385 390



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1222 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

CGTTTCACCC CAAGAATCTC AGACCATATA TCAGCAATGG CCTTCTCCCT 50 

GGCATTGCCC GGAGCGACAT AGATCGGATC CCGAATCACA GTATCGCGAT 100 

CAAATGGCGG CAGGGCGTTT CGGTCAATCT TGCCGTTCGG CGTTAAAGGG 150 

AGAGAATCGA CAATGACGAA GGCGCTGGGC ACCATGTAGT CCGGCAGTTT 200

TGCCTTCAGA TGGGCGCGCA ATTCGCTTAT TTCGGGAGCA CCTTCCCGTG 250

CGACGATATA AGCAACTAAT TGCTTTTCTT CGCTAGGGTC TTTTGTCGTT 300

GTGACCACAG CTTCTCGAAT CGGGGATGTT GCGCAACAGG ACTTCGATTT 350

CTCCAGCTCG ATGCGATAGC CGCGAATCTT GACCTGATTG TCGGTGCGGC 400

CGATAAACTC GATGTTGCCA TCCGGCAAAT AACGCGCAAG ATCGCCAGTT 450

CGATAGAGGC GCTGCGCTGG CTCGCGATCG AATGAATGGT AGATGAACCT 500

CTCCGCCGTC AGTTCCGGCC GGTTGAGATA CCCTCGCGCC AGTCCGTCGC 550 

CGCCAATGTA GATCTCTCCA ACCACGCCGA TCGGCACCGG ATTGAGATGA 600 

GCATCCAGTA TGTAGATCTG CGTATTCGCG ATCGGTCGGC CAATGGGCGG 650 

TAATTCTCCC CAGCACTCTG GCGGACCGTC CACAGTAAAC GCTGTCACAA 700 

CGTGGCTTTC CGTCGGCCCA TACTGGTTGA CCAAATGACA CTCGGGCAAC 750 

GTGTCAAGGA AACTTCTGAT CCGCGGCGTT ATCTGCAGCC GCTCTCCCGC 800 

CGTAATGACT TCGCGCAGCT GCGGCAAAAC CACATTCTCC ATGTGCGCGG 850 

CTTCCGCCAT CTGTTGCAGT ACGACAAAAG GCACAAAAAG TCTCTCTACT 900 

CGCTTCATTC GCAGGAAATT CAACAGGGCT GGCGGATCGC GTCGGATTTG 950

CGCGGGCAGT AGCACCAGTG TGCCTCCTGA GCACCACGTG CTAAACATCT 1000 

CTTGAAACGA AACATCGAAA CTCAACGAGG CAAACTGTAA CGTTCGCGCC 1050 

GGCACCGAAC GAGAAAAATC CTCAATTTGC CACGCGATCA GGTTGGCAAG 1100 

CGCGCGGTGT TCCATCACCA CACCCTTCGG CTTGCCCGTC GTGCCAATCC 1150 

CGCGGCCATG GCGGCCGGGA GCATGCGACG TCGGGCCCAA TTCGCCCTAT 1200 

AGTGAGTCGT ATTACAATTC AA 1222 



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
Gly Thr Thr Gly Lys Pro Lys Gly Val Val Met Glu His Arg Ala
5 10 15

Leu Ala Asn Leu Ile Ala Trp Gln Ile Glu Asp Phe Ser Arg Ser
20 25 30

Val Pro Ala Arg Thr Leu Gln Phe Ala Ser Leu Ser Phe Asp Val
35 40 45

Ser Phe Gln Glu Met Phe Ser Thr Trp Cys Ser Gly Gly Thr Leu
50 55 60

Val Leu Leu Pro Ala Gln Ile Arg Arg Asp Pro Pro Ala Leu Leu
65 70 75

Asn Phe Leu Arg Met Lys Arg Val Glu Arg Leu Phe Val Pro Phe
80 85 90

Val Val Leu Gln Gln Met Ala Glu Ala Ala His Met Glu Asn Val
95 100 105

Val Leu Pro Gln Leu Arg Glu Val Ile Thr Ala Gly Glu Arg Leu
110 115 120

Gln Ile Thr Pro Arg Ile Arg Ser Phe Leu Asp Thr Leu Pro Glu
125 130 135

Cys His Leu Val Asn Gln Tyr Gly Pro Thr Glu Ser His Val Val
140 145 150

Thr Ala Phe Thr Val Asp Gly Pro Pro Glu Cys Trp Gly Glu Leu
155 160 165

Pro Pro Ile Gly Arg Pro Ile Ala Asn Thr Gln Ile Tyr Ile Leu
170 175 180

Asp Ala His Leu Asn Pro Val Pro Ile Gly Val Val Gly Glu Ile
185 190 195

Tyr Ile Gly Gly Asp Gly Leu Ala Arg Gly Tyr Leu Asn Arg Pro
200 205 210

Glu Leu Thr Ala Glu Arg Phe Ile Tyr His Ser Phe Asp Arg Glu
215 220 225

Pro Ala Gln Arg Leu Tyr Arg Thr Gly Asp Leu Ala Arg Tyr Leu
230 235 240

Pro Asp Gly Asn Ile Glu Phe Ile Gly Arg Thr Asp Asn Gln Val
245 250 255

Lys Ile Arg Gly Tyr Arg Ile Glu Leu Glu Lys Ser Lys Ser Cys
260 265 270

Cys Ala Thr Ser Pro Ile Arg Glu Ala Val Val Thr Thr Thr Lys
275 280 285

Asp Pro Ser Glu Glu Lys Gln Leu Val Ala Tyr Ile Val Ala Arg
290 295 300

Glu Gly Ala Pro Glu Ile Ser Glu Leu Arg Ala His Leu Lys Ala
305 310 315

Lys Leu Pro Asp Tyr Met Val Pro Ser Ala Phe Val Ile Val Asp
320 325 330

Ser Leu Pro Leu Thr Pro Asn Gly Lys Ile Asp Arg Asn Ala Leu
335 340 345

Pro Pro Phe Asp Arg Asp Thr Val Ile Arg Asp Pro Ile Tyr Val
350 355 360

Ala Pro Gly Asn Ala Arg Glu Lys Ala Ile Ala Asp Ile Trp Ser
365 370 375

Glu Ile Leu Gly Val Lys Arg Ile Gly Val His Asp Asn Phe Phe
380 385 390

Ala Pro Gly Gly Pro Ser
395



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
AATCTACACG TCCGGCACCA CCGGCAAGCC CAAGGGGGCC ATAATCCATC 50 
ACCTGGGACT GGCGAATTAC TTGGTGTGGT GCTCGCGGGC TTACGCGATT 100 
GCTCAAGGAG TGGGAGCACC GGTCCACTCG TCGATCTCGT TCGATCTGAC 150 
GATCACTGCC TTGCTTGCCC CCTTGGTCGT CGGCCGGCGC ATCGACCTGC 200
TTGATGAAGA ACTGGGCATC GAGCAACTGA GTTACGCTCT CCGGCGATCG 250 
CGCGACTATA GCCTGGTCAA GATCACTCCG GCTCACCTGC GCTGGCTCGG 300 
CGATGAACTG GGACCCTGCG AGGCCGAAGG TCGTACGCGA GCTTTCATCA 350
TCGGTGGTGA GCAACTGACG GCCGAACACG TCKCATTCTG GAGGCGGCAC 400 
GCGCCGGGGA CGAGCCTGAT CAACGAGTAT GGTCCGACCG AGACGGTCGT 450 
CGGCTGCTGC GTGTACCGCG TGCCTCCTGA CCAGGAGATT TCGGGGCCCA 500 
TCCCGATTGG CCGACCGATC GCCAACACGC GTCTCTACGT CCTCGATCCG 550 
GATCTCGCGC TGGTACCCAT CGGCGTTGCA GGCGAGCTGT ACATCGGCGG 600 
TGCCGGGGTC GCGCGGGGGT ATCTCAACAG GCCCGGCCTG ACCGCTGAAA 650 
GGTTCATCCC CGACCCGTTC GGCAAGAAGC CGGGCGAGCG CCTCTATCGC 700 
ACCGGAGACC TCGCCCGATG GCGGTCCGAC GGTAACCTCG AGTATCTCGG 750 
CAGGGTCGAT CGCCAGGTTA AAGTCCGCGG GTTTCGGATC GAACCCGGGG 800 
AGATCGAACA GGCACTCGCC CGGCACTCCG CGGTACGCGA GTCCGTCGTG 850 
GTCGCAAGCG CAGGTGCATC GGACGTGCAA CGCCTCGTCG CCTATCTGGT 900 
TCTTGCGGAG GCAGGGCCGG CACCGCCCGA CTCGGAGCTG CGCGAGTTCC 950

TGCGGACGTT ACTCCCCGAG CCGATGATAC CCTCGGCATT CGTTGTGCTG 1000 

GAGACGCTCC CACTGACCCA CAACGGGAAG GTGGACCGAG AGGCCCTGCC 1050 

GGCCCCTGAG GGTGTGCCCT TCCGTGGGGA TGCTCGTTTC GTTGCTCCCC 1100 

GCGGCCCGCT CGAACAGGAG GTGGCATCGA TCTGGGGTGC AGTCCTCGGA 1150 
CTGGAGCGTA TCGGCGCCCT TGACAACTTC TTCTTCCCTC GGCGGCCCCT 1200 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 399 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Ala Ile Ile
5 10 15

His His Leu Gly Leu Ala Asn Tyr Leu Val Trp Cys Ser Arg Ala
20 25 30

Tyr Ala Ile Ala Gln Gly Val Gly Ala Pro Val His Ser Ser Ile
35 40 45

Ser Phe Asp Leu Thr Ile Thr Ala Leu Leu Ala Pro Leu Val Val
50 55 60

Gly Arg Arg Ile Asp Leu Leu Asp Glu Glu Leu Gly Ile Glu Gln
65 70 75

Leu Ser Tyr Ala Leu Arg Arg Ser Arg Asp Tyr Ser Leu Val Lys
80 85 90

Ile Thr Pro Ala His Leu Arg Trp Leu Gly Asp Glu Leu Gly Pro
95 100 105

Cys Glu Ala Glu Gly Arg Thr Arg Ala Phe Ile Ile Gly Gly Glu
110 115 120

Gln Leu Thr Ala Glu His Val Xaa Phe Trp Arg Arg His Ala Pro
125 130 135

Gly Thr Ser Leu Ile Asn Glu Tyr Gly Pro Thr Glu Thr Val Val
140 145 150

Gly Cys Cys Val Tyr Arg Val Pro Pro Asp Gln Glu Ile Ser Gly
155 160 165

Pro Ile Pro Ile Gly Arg Pro Ile Ala Asn Thr Arg Leu Tyr Val
170 175 180

Leu Asp Pro Asp Leu Ala Leu Val Pro Ile Gly Val Ala Gly Glu
185 190 195

Leu Tyr Ile Gly Gly Ala Gly Val Ala Arg Gly Tyr Leu Asn Arg
200 205 210

Pro Gly Leu Thr Ala Glu Arg Phe Ile Pro Asp Pro Phe Gly Lys
215 220 225

Lys Pro Gly Glu Arg Leu Tyr Arg Thr Gly Asp Leu Ala Arg Trp
230 235 240

Arg Ser Asp Gly Asn Leu Glu Tyr Leu Gly Arg Val Asp Arg Gln
245 250 255

Val Lys Val Arg Gly Phe Arg Ile Glu Pro Gly Glu Ile Glu Gln
260 265 270

Ala Leu Ala Arg His Ser Ala Val Arg Glu Ser Val Val Val Ala
275 280 285

Ser Ala Gly Ala Ser Asp Val Gln Arg Leu Val Ala Tyr Leu Val
290 295 300

Leu Ala Glu Ala Gly Pro Ala Pro Pro Asp Ser Glu Leu Arg Glu
305 310 315

Phe Leu Arg Thr Leu Leu Pro Glu Pro Met Ile Pro Ser Ala Phe
320 325 330

Val Val Leu Glu Thr Leu Pro Leu Thr His Asn Gly Lys Val Asp
335 340 345

Arg Glu Ala Leu Pro Ala Pro Glu Gly Val Pro Phe Arg Gly Asp
350 355 360

Ala Arg Phe Val Ala Pro Arg Gly Pro Leu Glu Gln Glu Val Ala
365 370 375

Ser Ile Trp Gly Ala Val Leu Gly Leu Glu Arg Ile Gly Ala Leu
380 385 390

Asp Asn Phe Phe Phe Pro Arg Arg Pro
395

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1204 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
AGGGGCCGCC GGGCGAGAAG AAGTTCGCGG TGATGCTCAC CGGCGCGTCG 50 
AGCTTCAACG CCTCCTGCCA GATCTCCGCG AGCTTGCTCT CCGTCTCCGT 100 
GCCCGGCGCT ACGTATTGGG CGCCGGCGCT ACGGTCGATG GACGGCAGCG 150
CCTTACGATC GATCTTGCCG TTGGCATTCA GCGGAAAGGC CTCCAGGACG 200 
CGCCAGCCGC TGGGAATCAT GTACTCGGGC AGGGCCAGCT TGAGGCGCAT 250 
CCGCAGCGCC GAGATGAGCA CCTCTTCGTC CGCGGTCTGG GCCACGACGT 300 
AGGCGACGAG GGCCTTGTTC TCCCCCTCTC CCTGCGCCAC GACCAGGGCG 350 
TCGTCGACGC CAGCCTCGGT CTTCAGCGCG GTCTCGATCT CGCCGAGCTC 400
GATGCGGAAG CCGCGGATCT TGATCTGGTC GTCGAGGCGG CCGAGGAACT 450 
CGAGATCGCC GCTGGCGAGC CGGCGAACGA GGTCGCCGCT GCGATAGAGG 500 
CGCCCTTCGC CGAAGGGATT GGCGATGAAC TTCGCCGCCG TCAGCTCCGG 550 
CTGGTTGACG TAGCCTCTGG CCACCCCTGC CCCGCCAATG CACAGCTCGC 600 
CGGCCACGCC GACCGGCGCG ATCTCCAGTG CCTCGTTGAG GACATACAGC 650 
TCCGTGTTGT CCATGGCCCT GCCGATGGGC AGGCGCTCCG GCAGGCCGGC 700 
CTGGAGAGCG GCGGTGACGT CGAACATGGC GCAGCCGACC ACGGTCTCCG 750 
TGGGACCGTA GTGGTTGTAG ATCTGGGCGT GGGGGAAGCG CGTTTGCAGC 800 
TCGCGGGCGA GCGAGGCGGG AAACGATTCG CCGCCGATGA CGAAAACGTG 850 
TTGAGATGAA GCCCGGGCCG TGTCTTCCGT CAGCTCCGCG CTGTCGAGCA 900 
GAGCGAGCAT ACCGGTGAGA TGCATCGGCG TCATGCGCAG CAGATAAGCC 950 
CGTTCGTCGC CGGCCAACGC TTTCGCGAGC TCGTTCAACT CATCGCCGGG 1000 
CGTGGTCAGC GAGACGCAGC CACCCCGGAG CAAGGGAACA TACAGGCTGG 1050 
GCACGGTGAT GTCGAAGCCG TGGGAGGTGA CGAGGAGGGA GCCGGCCAAC 1100 
CCCTTCGCGT AGTAGCGCTG CGAAGCGAAG GCGCAGTAGT CACTGAGGCC 1150 
GGCGTGTCTG ATCTCCACGC CCTTCGGCTT GCCCGTCGTG CCGGACGTGT 1200 
AGAT 1204 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 401 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Glu Ile
5 10 15

Arg His Ala Gly Leu Ser Asp Tyr Cys Ala Phe Ala Ser Gln Arg
20 25 30

Tyr Tyr Ala Lys Gly Leu Ala Gly Ser Leu Val Val Thr Ser His
35 40 45

Gly Phe Asp Ile Thr Val Pro Ser Leu Tyr Val Pro Leu Leu Arg
50 55 60

Gly Gly Cys Val Ser Leu Thr Thr Pro Gly Asp Glu Leu Asn Glu
65 70 75

Leu Ala Lys Ala Leu Ala Gly Asp Glu Arg Ala Tyr Leu Leu Arg
80 85 90

Met Thr Pro Met His Leu Thr Gly Met Leu Ala Leu Leu Asp Ser
95 100 105

Ala Glu Leu Thr Glu Asp Thr Ala Arg Ala Ser Ser Gln His Val
110 115 120

Phe Val Ile Gly Gly Glu Ser Phe Pro Ala Ser Leu Ala Arg Glu
125 130 135

Leu Gln Thr Arg Phe Pro His Ala Gln Ile Tyr Asn His Tyr Gly
140 145 150

Pro Thr Glu Thr Val Val Gly Cys Ala Met Phe Asp Val Thr Ala
155 160 165

Ala Leu Gln Ala Gly Leu Pro Glu Arg Leu Pro Ile Gly Arg Ala
170 175 180

Met Asp Asn Thr Glu Leu Tyr Val Leu Asn Glu Ala Leu Glu Ile
185 190 195

Ala Pro Val Gly Val Ala Gly Glu Leu Cys Ile Gly Gly Ala Gly
200 205 210

Val Ala Arg Gly Tyr Val Asn Gln Pro Glu Leu Thr Ala Ala Lys
215 220 225

Phe Ile Ala Asn Pro Phe Gly Glu Gly Arg Leu Tyr Arg Ser Gly
230 235 240

Asp Leu Val Arg Arg Leu Ala Ser Gly Asp Leu Glu Phe Leu Gly
245 250 255

Arg Leu Asp Asp Gln Ile Lys Ile Arg Gly Phe Arg Ile Glu Leu
260 265 270

Gly Glu Ile Glu Thr Ala Leu Lys Thr Glu Ala Gly Val Asp Asp
275 280 285

Ala Leu Val Val Ala Gln Gly Glu Gly Glu Asn Lys Ala Leu Val
290 295 300

Ala Tyr Val Val Ala Gln Thr Ala Asp Glu Glu Val Leu Ile Ser
305 310 315

Ala Leu Arg Met Arg Leu Lys Leu Ala Leu Pro Glu Tyr Met Ile
320 325 330

Pro Ser Gly Trp Arg Val Leu Glu Ala Phe Pro Leu Asn Ala Asn
335 340 345

Gly Lys Ile Asp Arg Lys Ala Leu Pro Ser Ile Asp Arg Ser Ala
350 355 360

Gly Ala Gln Tyr Val Ala Pro Gly Thr Glu Thr Glu Ser Lys Leu
365 370 375

Ala Glu Ile Trp Gln Glu Ala Leu Lys Leu Asp Ala Pro Val Ser
380 385 390

Ile Thr Ala Asn Phe Phe Ser Pro Gly Gly Pro
395 400

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1190 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
ATCTACACCT CGGGCACGAC CGGCAAGCCG AAGGGGATCA TGTATTCGCA 50 
TCGATACCTG TTGCATAATA TGCGCAACTA CGGCGACTTA TTTCAGGTCT 100 
CCCCCCACGA TCGCTGGAGT TGGTTGCATT CCTACAGCTA TGCTTCGGCG 150 
AATACTGATA TCCTTTGCCC GCTACTGCAC GGCGCCGCCG TCTGCCCTTG 200 
GAATTTGCAT CGTAATGGCC TATCGGGCTT AGCTCGTTGG CTCGCCGAGT 250 
CGCGAATCAC CATTTTGAAC TGGATGCCGA CACCGCTACG CAGTTTGGCA 300
AAGCTCTGGC CGCCAAAGCA CGTGCTTCCC GATCTGCGAC TTACAGTGTT 350
GGGCGGCGAA ACGCTGTTTG CCCAAGACGT TGCTGACTTT CGGCGAATAA 400 
TTTCGCTGAA TTGCCTAATC GCCAATCGTC TGGGAACTTC GGAAACTGGA 450 
TTGTTTCGGC TCGCGTTTCT CGACCGAGAG ACTCCCCTTG CTAATGGTTC 500 
CATACAGGCC GGATACGAAG TTCCAGACAA GACCGTCGTC CTGTTCGACG 550 
AATATGGAGT TGAGCTGGCC CCTGGCAACG TCGGTCAGAT TGGCGTGCGC 600 
AGCAGGTACT TGCCGCCTGG ATACTGGCGA CGGCCGGAGT TGACAAGCGA 650 
GCGATTTCTA ACCAGTAAAG GCGATGATGA CGTACGGACC TTCCTCACCG 700 
GCGACCTTGG GCGAATGCGG GACGACGGAT GCCTCGAGCA CTGCGGACGG 750 
CTCGACTCCC AAGTGAAGAT CCGTGGTCAC CGCATCGCAA TGGGAGAGAT 800 
CGAATTCTTG CTTCGGACAT GCGACGGAGT CAGCGAAGCA GTTGTCATTG 850 
CCAGGCCACA TTCAGACGGT GAAACCCGTT TGATAGCTTA TTTTGTGCCG 900 
ACCGAGAAAA GCGCTATCGA TGTATCGAGC CTTCGTCGGC ACCTGCTGGG 950 
AAAGCTGCCT GGCCACATGA TCCCCTCGGC GTTTGTGCGG CTCGACGGCG 1000 
TGCCCAAAAA CGCCAACCAA AAAGTAGATT GGGCGGCCTT GCCAGCACCG 1050 
AACTTCCAAA ACCAGGGACA GCAGCACGTA CCGCCACAAA CGCCTTGGCA 1100 
GCGACATCTC GTGGAGTTGT GGCAAAAGTT GTTGAATGTG GAATCGATCG 1150 
GCATCCACGA TGACTTCTTC GCCCTCGGCG GCCCCTCCTT 1190 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Ile Met Tyr

5 10 15 

Ser His Arg Tyr Leu Leu His Asn Met Arg Asn Tyr Gly Asp Leu 




20 25 30 

Phe Gln Val Ser Pro His Asp Arg Trp Ser Trp Leu His Ser Tyr
35 40 45

Ser Tyr Ala Ser Ala Asn Thr Asp Ile Leu Cys Pro Leu Leu His
50 55 60

Gly Ala Ala Val Cys Pro Trp Asn Leu His Arg Asn Gly Leu Ser
65 70 75

Gly Leu Ala Arg Trp Leu Ala Glu Ser Arg Ile Thr Ile Leu Asn
80 85 90

Trp Met Pro Thr Pro Leu Arg Ser Leu Ala Lys Leu Trp Pro Pro
95 100 105

Lys His Val Leu Pro Asp Leu Arg Leu Thr Val Leu Gly Gly Glu
110 115 120

Thr Leu Phe Ala Gln Asp Val Ala Asp Phe Arg Arg Ile Ile Ser
125 130 135

Leu Asn Cys Leu Ile Ala Asn Arg Leu Gly Thr Ser Glu Thr Gly
140 145 150

Leu Phe Arg Leu Ala Phe Leu Asp Arg Glu Thr Pro Leu Ala Asn
155 160 165

Gly Ser Ile Gln Ala Gly Tyr Glu Val Pro Asp Lys Thr Val Val
170 175 180

Leu Phe Asp Glu Tyr Gly Val Glu Leu Ala Pro Gly Asn Val Gly
185 190 195

Gln Ile Gly Val Arg Ser Arg Tyr Leu Pro Pro Gly Tyr Trp Arg
200 205 210

Arg Pro Glu Leu Thr Ser Glu Arg Phe Leu Thr Ser Lys Gly Asp
215 220 225

Asp Asp Val Arg Thr Phe Leu Thr Gly Asp Leu Gly Arg Met Arg
230 235 240

Asp Asp Gly Cys Leu Glu His Cys Gly Arg Leu Asp Ser Gln Val
245 250 255

Lys Ile Arg Gly His Arg Ile Ala Met Gly Glu Ile Glu Phe Leu
260 265 270

Leu Arg Thr Cys Asp Gly Val Ser Glu Ala Val Val Ile Ala Arg
275 280 285

Pro His Ser Asp Gly Glu Thr Arg Leu Ile Ala Tyr Phe Val Pro
290 295 300

Thr Glu Lys Ser Ala Ile Asp Val Ser Ser Leu Arg Arg His Leu
305 310 315

Leu Gly Lys Leu Pro Gly His Met Ile Pro Ser Ala Phe Val Arg
320 325 330

Leu Asp Gly Val Pro Lys Asn Ala Asn Gln Lys Val Asp Trp Ala
335 340 345

Ala Leu Pro Ala Pro Asn Phe Gln Asn Gln Gly Gln Gln His Val
350 355 360

Pro Pro Gln Thr Pro Trp Gln Arg His Leu Val Glu Leu Trp Gln
365 370 375

Lys Leu Leu Asn Val Glu Ser Ile Gly Ile His Asp Asp Phe Phe
380 385 390

Ala Leu Gly Gly Pro Ser
395

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1178 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
AAGGAGGGGC CGCCCGGCGC GAAGAAGTTC TCGTGTAGCC CGACGCGTTC 50 
CAGCTGCAGC ACGGCGCACC AGATCGCTGC GACCTGCCGC TGGACGTCCG 100 
TCATGATCGC GGTGTCCGCT GCGGCCGCTG CCGCGCGATT CACCTGTGGA 150 
ATGGGCAGGG CCTTGCGGTC GATCTTGTCG TTCGGCGTGA GCGGCAGCGC 200 
GGCGAGCGAT ACGATCACCT GTGGCACCAT GTACTCGGGG AGTCTCGCGC 250 
GGAGCGCCGT CCGGAGCTCG TCGAGCGGCA GCACGCCGTC TTCTGCCGGG 300 
ACGACGTACG CCACCAGACG CTGATCGCCG GGGGTGTCCT CGCGCACGAC 350 
GGCCACGCTG CGGCGCACCG ACGGATGCTC GGACAGGACC GATTCGATCT 400 
CCCCCAGCTC GATCCGGTAG CCGCGAAGCT TCACCTGATG ATCTCGGCGT 450 
CCGACGAACT CGAGGGCCCG ATCGGCGCGC AGTCGTACGA TGTCGCCGGT 500 
GCGGTACACG CGCTCCGCCG GTCTGCCCGC GACCTCGACG ACGACGAACT 550 
TTTCTGCCGT GAGCTCGGGT CGATGACGAT AGCCCCGCGC CACGCCCTCT 600 
CCTCCGATGC ACAGCTCACC CGGCACGCCG ATGGGAGCCT GGCGACCCGC 650 
GGCGTCGAGC ACGTAGACGT TCGTGTTGGC GATGGGATGG CCGATCGGAA 700 
TATCGCGATC GCAATCCGTG ACCTGATGCA CGGTCGACCA GATCGTCGTC 750 
TCGGTCGGGC CGTACATGTT CCACAGCGCC CGCACCCTCG ACGAGAGATC 800 
GCGCGCGAGA TCGCGTGGAA GGGCCTCCCC GCCGCAGAGC GCGGTGAGAT 850 
CCGTCTTGCC CTGCCAGCCG GCGTCGATGA GCAGGCGCCA GGTCGCGGGG 900 
GTCGCCTGCA TCATCGTCGC TCTGCACGAT TCGATGCGCT CGCGAAGACG 950 
CTCGCCGTCG AGCACGTCGC CGCGGGAGGC GATGACCGTC CTCCCGCCGA 1000 
CGACGAGAGG CAAGAACAGC TCGAGACCCG CGATGTCGAA CGACGGCGTG 1050 
GTGACCGCGA GGAGCACGTC GCCGGCTCGC AAGCCTGGCT CCTTCTGCAT 1100 
GGCGCGCAGG AAATTCACGA GCTGGCGGTG CTCGATCTCG ACCCCCTTCG 1150
GCTTGCCCGT CGTGCCCGAC GTGTAGAT 1178



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Glu Ile
5 10 15

Glu His Arg Gln Leu Val Asn Phe Leu Arg Ala Met Gln Lys Glu
20 25 30

Pro Gly Leu Arg Ala Gly Asp Val Leu Leu Ala Val Thr Thr Pro
35 40 45

Ser Phe Asp Ile Ala Gly Leu Glu Leu Phe Leu Pro Leu Val Val
50 55 60

Gly Gly Arg Thr Val Ile Ala Ser Arg Gly Asp Val Leu Asp Gly
65 70 75

Glu Arg Leu Arg Glu Arg Ile Glu Ser Cys Arg Ala Thr Met Met
80 85 90

Gln Ala Thr Pro Ala Thr Trp Arg Leu Leu Ile Asp Ala Gly Trp
95 100 105

Gln Gly Lys Thr Asp Leu Thr Ala Leu Cys Gly Gly Glu Ala Leu
110 115 120 

Pro Arg Asp Leu Ala Arg Asp Leu Ser Ser Arg Val Arg Ala Leu 
125 130 135 

Trp Asn Met Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Val
140 145 150

His Gln Val Thr Asp Cys Asp Arg Asp Ile Pro Ile Gly His Pro
155 160 165

Ile Ala Asn Thr Asn Val Tyr Val Leu Asp Ala Ala Gly Arg Gln
170 175 180

Ala Pro Ile Gly Val Pro Gly Glu Leu Cys Ile Gly Gly Glu Gly
185 190 195

Val Ala Arg Gly Tyr Arg His Arg Pro Glu Leu Thr Ala Glu Lys
200 205 210

Phe Val Val Val Glu Val Ala Gly Arg Pro Ala Glu Arg Val Tyr
215 220 225

Arg Thr Gly Asp Ile Val Arg Leu Arg Ala Asp Arg Ala Leu Glu
230 235 240 

Phe Val Gly Arg Arg Asp His Gln Val Lys Leu Arg Gly Tyr Arg
245 250 255

Ile Glu Leu Gly Glu Ile Glu Ser Val Leu Ser Glu His Pro Ser
260 265 270

Val Arg Arg Ser Val Ala Val Val Arg Glu Asp Thr Pro Gly Asp
275 280 285

Gln Arg Leu Val Ala Tyr Val Val Pro Ala Glu Asp Gly Val Leu
290 295 300

Pro Leu Asp Glu Leu Arg Thr Ala Leu Arg Ala Arg Leu Pro Glu
305 310 315

Tyr Met Val Pro Gln Val Ile Val Ser Leu Ala Ala Leu Pro Leu
320 325 330

Thr Pro Asn Asp Lys Ile Asp Arg Lys Ala Leu Pro Ile Pro Gln
335 340 345

Val Asn Arg Ala Ala Ala Ala Ala Ala Asp Thr Ala Ile Met Thr
350 355 360

Asp Val Gln Arg Gln Val Ala Ala Ile Trp Cys Ala Val Leu Gln
365 370 375 

Leu Glu Arg Val Gly Leu His Glu Asn Phe Phe Ala Pro Gly Gly 
380 385 390 

Pro Ser 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1178 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
ATCTACACCT CCGGCACGAC GGGCAAGCCG AAGGGAGTAA AGATCACACA 50 
TCGTGCCGTG GTGAATTTTC TGAACTCGAT GCGGCGTGAA CCAGGGCTGA 100 
CCCCGGACGA TGTGGTGCTC TCGGTCACCA CGCTGTCGTT TGACATTGCC 150 
GGACTCGAAC TCCACCTGCC CCTGACGACT GGAGCCACGG TCGTAGTGGC 200 
GACCCAAGAC GCGGTGTCCG ACGCTGAACT GCTGGTCAGA GAGTTGGAGC 250
GGACCGGAAC AACTCTGTTG CAGGCGACGC CAGTCACATG GCGAATGCTT 300
CTGGAGTCGG GCTGGAAAGG AAATCCGCGA CTCAAGGCTC TGGTCGGAGG 350 
TGAGGCAGTG CCGAGGGACC TGGTGAATCG GCTTGCTCCC CTTTGCGCGT 400 
CACTTTGGAA CATGTACGGA CCAACGGAAA CCACGATCTG GTCAACGGTT 450 
GGGCGTCTGG AGGCTGGAGA TGGTGTGTCT AGTATTGGCC GGCCCATCGA 500 
CAATACGCGG ATTTACGTCG TGGATCCGTC GATACACCTT CAGCCCATCG 550 
GAGTTCCCGG CGAATTGCTG ATTGGCGGAG AAGGATTGGC CGACGGATAT 600 
CTGAAACGCG ATCAGTTGAC GGCAGAGAAG TTCATTCCTG ATCCATTTGG 650 
TGGGAGGCCT GGGTCTCGGC TGTATCGAAC CGGAGATCTT GCGCGCTGGC 700 
GCGCGGACGG CACCTTGGAG TGTCTCGGAC GAATGGACCA ACAGGTGAAG 750 
ATTCGGGGTT CCCGGATCGA ATTGGGTGAG ATCGAAACCC TGTTGGCCTC 800 
CCACCCGGAT GTGAAACAGA ACGTGGTGGT CGTACGCGAG GACAGCCCCG 850 
GGGAAAAAAA ATTGGTGGGC TATTTCGTGC CGGCGAACGG ACGCAATCCC 900 
GAAGTGATGG AATTTCGCAA ACATCTGCAG CGGACGCTTC CGGATTACAT 950 
GGTCCCCTCA GTGTACGTGC CCTTGACCTC GGTTCCGCTT ACACCCAACG 1000 
GAAAGATCGA CCGCAAGGCG CTGCCCGCAC CGGATATCAG CGCCGTCACG 1050 
GTTTCCCGAG AGTCAATTGC GCCGCGCAAT CCCGCCGAAG AGCGGCTGGC 1100 
AGCAATTTTC GCCAAGGTGC TTGGCACGCC GATCGCCTCG ATCCACGACA 1150 
GCTTCTTCTC CCCGGGCGGC CCCTCCAT 1178 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Lys Ile
5 10 15

Thr His Arg Ala Val Val Asn Phe Leu Asn Ser Met Arg Arg Glu
20 25 30

Pro Gly Leu Thr Pro Asp Asp Val Val Leu Ser Val Thr Thr Leu
35 40 45

Ser Phe Asp Ile Ala Gly Leu Glu Leu His Leu Pro Leu Thr Thr
50 55 60

Gly Ala Thr Val Val Val Ala Thr Gln Asp Ala Val Ser Asp Ala
65 70 75

Glu Leu Leu Val Arg Glu Leu Glu Arg Thr Gly Thr Thr Leu Leu
80 85 90

Gln Ala Thr Pro Val Thr Trp Arg Met Leu Leu Glu Ser Gly Trp
95 100 105

Lys Gly Asn Pro Arg Leu Lys Ala Leu Val Gly Gly Glu Ala Val
110 115 120

Pro Arg Asp Leu Val Asn Arg Leu Ala Pro Leu Cys Ala Ser Leu
125 130 135

Trp Asn Met Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Val
140 145 150

Gly Arg Leu Glu Ala Gly Asp Gly Val Ser Ser Ile Gly Arg Pro
155 160 165

Ile Asp Asn Thr Arg Ile Tyr Val Val Asp Pro Ser Ile His Leu
170 175 180

Gln Pro Ile Gly Val Pro Gly Glu Leu Leu Ile Gly Gly Glu Gly
185 190 195

Leu Ala Asp Gly Tyr Leu Lys Arg Asp Gln Leu Thr Ala Glu Lys
200 205 210

Phe Ile Pro Asp Pro Phe Gly Gly Arg Pro Gly Ser Arg Leu Tyr
215 220 225 

Arg Thr Gly Asp Leu Ala Arg Trp Arg Ala Asp Gly Thr Leu Glu
230 235 240

Cys Leu Gly Arg Met Asp Gln Gln Val Lys Ile Arg Gly Ser Arg
245 250 255

Ile Glu Leu Gly Glu Ile Glu Thr Leu Leu Ala Ser His Pro Asp
260 265 270

Val Lys Gln Asn Val Val Val Val Arg Glu Asp Ser Pro Gly Glu
275 280 285

Lys Lys Leu Val Gly Tyr Phe Val Pro Ala Asn Gly Arg Asn Pro
290 295 300

Glu Val Met Glu Phe Arg Lys His Leu Gln Arg Thr Leu Pro Asp
305 310 315

Tyr Met Val Pro Ser Val Tyr Val Pro Leu Thr Ser Val Pro Leu
320 325 330

Thr Pro Asn Gly Lys Ile Asp Arg Lys Ala Leu Pro Ala Pro Asp
335 340 345

Ile Ser Ala Val Thr Val Ser Arg Glu Ser Ile Ala Pro Arg Asn
350 355 360

Pro Ala Glu Glu Arg Leu Ala Ala Ile Phe Ala Lys Val Leu Gly
365 370 375

Thr Pro Ile Ala Ser Ile His Asp Ser Phe Phe Ser Pro Gly Gly
380 385 390

CLAIMS



1. A method for recovery of antibiotic biosynthetic DNA from humic materials or lichen comprising the steps of:

(a) combining a humic or lichen-derived sample with a set of amplification primers under conditions suitable for polymerase chain reaction amplification, wherein the primer set is a degenerate primer set selected to hybridize with conserved regions of an antibiotic biosynthetic gene;

(b) cycling the combined sample through a plurality of amplification cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

2. The method according to claim 1, wherein the primer set hybridizes with a polyketide synthase gene.

3. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 1 and 2.

4. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 3 and 4.

5. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 5 and 6.

6. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 11 and 12.






7. The method according to claim 1, wherein the primer set hybridizes with an isopenicillin N synthase gene.

8. The method according to claim 7, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 7 and 8.

9. The method according to claim 1, wherein the primer set hybridizes with a peptide synthetase gene.

10. The method according to claim 9, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 9 and 10.

11. The method according to any of claims 1 to 10, wherein the sample comprises DNA extracted from a soil sample.

12. The method according to claim 1, wherein the sample is a lichen-derived sample.

13. The method according to any of claims 1 to 12, further comprising the steps of cloning the isolated DNA into a host organism, and isolating the cloned DNA.

14. The method according to claim 13, wherein the host organism is E. coli.

15. An oligonucleotide primer having the sequence as defined in any of SEQ ID Nos. 1 through 8.

16. A composition comprising two oligonucleotide primers having the sequence as defined in SEQ ID Nos. 1 and 2; 3 and 4; 5 and 6; or 7 and 8.

17. A polynucleotide comprising a region having the sequence given by any of SEQ ID Nos. 13, 15, 17, 19, 21, 23, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91 or 93.

18. A biosynthetic polypeptide encoded by a polynucleotide comprising a region having the sequence given by any of SEQ ID Nos. 13, 15, 17, 19, 21, 23, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91 or 93.

19. The biosynthetic polypeptide of claim 18, wherein the polypeptide has the amino acid sequence given by any of SEQ ID Nos. 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92 or 94.